NOVEL CODING SEQUENCES FROM HERPES SIMPLEX VIRUS TYPE-2

Field of the Invention:

This invention relates to newly identified Herpes Simplex Virus type 2 (HSV-2) 
polynucleotides, polypeptides encoded by such polynucleotides, the uses of such 
polynucleotides and polypeptides, as well as the production of such polynucleotides and
polypeptides and recombinant host cells transformed with the polynucleotides. This 
invention also relates to inhibiting the biosynthesis or action of such polynucleotides or 
polypeptides and to the use of such inhibitors in therapy of viral infections or related 
diseases. 

Background of the Invention:

The herpes viruses consist of large icosahedral enveloped virions containing linear 
double stranded DNA genomes. Currently, eight human herpes viruses have been isolated 
and are known to be responsible for a variety of disease states, from sub-clinical infections 
to fatal disease in the immunocompromised. One human herpes virus, herpes simplex
virus type 2, designated HSV-2, is usually acquired through sexual contact, giving rise to the
condition known as genital herpes. The frequency of recurrence of secondary genital herpes 
ranges between one and six times per year per infected individual. It is estimated that 
genital HSV-2 infections occur in ten to sixty million individuals in the USA. Less 
frequently, HSV-2 infection results in herpes labialis, seen as cold sores. 

General information about HSV-2 may be found in various treatises, such as Herpes
Simplex Viruses, in: "Fields Virology", 3rd ed., Lippincott-Raven Pub., pp. 2297-2342
(1996); Magder, L.S., et al., New England J. Med. 321:7-12 (1989); and "The Human
Herpesviruses", Roizman, B. et al., eds., Raven Press, New York (1993), the contents of
which are incorporated herein by reference for purposes of background. 

Currently, there are no vaccines available to protect against HSV-2 infection.
Individuals continue to become infected by the virus and no completely satisfactory antiviral
agents or vaccines are available. Thus HSV-2 presents a major public health problem.
There is a need for prophylactic and therapeutic vaccines as well as a method of identifying
anti-HSV-2 agents and for reagents useful in such methods. There is a need for a method of
identifying compounds which modulate the activity of HSV-2 polynucleotides and proteins
and which affect the ability of the virus to replicate and produce multiple infectious virions
in an infected cell. There is a need for methods of, and kits for, distinguishing HSV-2
infections from other herpes virus infections.

Brief Description of the Invention:

Toward these ends, it is an object of the present invention to provide polypeptides, inter 
alia, that have been identified as novel HSV-2 polypeptides by comparison between the amino
acid sequences set out in Tables 1-4 and known amino acid sequences of proteins of other viruses
such as herpes simplex virus type-1 (HSV-1).

It is a further object of the invention to provide polynucleotides that encode HSV-2
proteins, particularly polynucleotides that encode the polypeptides encoded by the Open Reading 
Frames (ORFs) provided herein, or fragments, analogs or derivatives thereof. 

In a particularly preferred embodiment of this aspect of the invention the 
polynucleotides comprise any of the regions encoding HSV-2 proteins in the sequences set out in
Tables 1-4, including fragments, analogs or derivatives thereof. 

In another particularly preferred embodiment of the present invention, there is a novel 
HSV-2 protein comprising any of the amino acid sequences shown in Table 1, or fragments, 
analogues or derivatives thereof.

In accordance with the invention there is provided an isolated nucleic acid molecule
encoding a polypeptide expressible by the HSV-2 polynucleotide contained in the deposited
HSV-2 strain, SB5. 

In accordance with the invention there are provided isolated nucleic acid molecules 
encoding HSV-2 proteins, such as mRNAs, cDNAs and genomic DNAs,
and, in further embodiments of this aspect of the invention, biologically, diagnostically, clinically
or therapeutically useful variants, analogs or derivatives thereof, or fragments thereof, including 
fragments of the variants, analogs and derivatives. 

Among the particularly preferred embodiments of this aspect of the invention are 
naturally occurring allelic variants of HSV-2 proteins.

In accordance with this aspect of the invention there are provided novel polypeptides of
HSV-2 origin as well as biologically, diagnostically or therapeutically useful fragments thereof,
as well as variants, derivatives and analogs of the foregoing and fragments thereof. 

In accordance with certain preferred embodiments of this and other aspects of the 
invention there are probes that hybridize to HSV-2 sequences useful for detection of viral 
infection.

It also is an object of the invention to provide HSV-2 polypeptides or fragments thereof 
that may be employed for therapeutic or prophylactic purposes, for example, to treat disease, 
including treatment by conferring host immunity against viral infections, or as an antiviral agent 
or a vaccine.

In accordance with another aspect of the present invention, there is provided the use
of a polynucleotide of the invention for therapeutic or prophylactic purposes, in particular 
genetic immunization. 

Among the particularly preferred embodiments of this aspect of the invention are 
variants of HSV-2 polypeptides encoded by naturally occurring alleles of HSV-2 genes for
therapeutic or prophylactic use. 

It is another object of the invention to provide a process for producing the 
aforementioned polypeptides, polypeptide fragments, variants and derivatives, fragments of the 
variants and derivatives, and analogs thereof.

In a preferred embodiment of this aspect of the invention there are provided methods for
producing the aforementioned HSV-2 polypeptides comprising culturing host cells having
expressibly incorporated therein an exogenously-derived HSV-2 encoding polynucleotide under 
conditions for expression of HSV-2 in the host and then recovering the expressed polypeptide. 

In accordance with another object of the invention there are provided products, 
compositions, processes and methods that utilize the aforementioned polypeptides and
polynucleotides, inter alia, for research, biological, clinical, diagnostic, prophylactic and
therapeutic purposes. 

In accordance with yet another aspect of the present invention, there are provided 
inhibitors of such polypeptides, useful as antiviral agents. In particular, there are provided
antibodies against such polypeptides. In certain particularly preferred embodiments in this
regard, the antibodies are selective for HSV-2. 

In a further aspect of the invention there are provided compositions comprising an HSV-2
polynucleotide or HSV-2 polypeptide for administration to cells in vitro, to cells ex vivo and to
cells in vivo, or to a multicellular organism. In certain preferred embodiments of this aspect of
the invention, the compositions comprise an HSV-2 polynucleotide for expression of an HSV-2
polypeptide in a host organism to raise an immunological response, preferably to raise immunity
in such host against HSV-2 or related organisms. 

Other objects, features, advantages and aspects of the present invention will become 
apparent to those of skill from the following description. It should be understood, however, that
the following description and the specific examples, while indicating preferred embodiments of
the invention, are given by way of illustration only. Various changes and modifications within 
the spirit and scope of the disclosed invention will become readily apparent to those skilled in the 
art from reading the following description and from reading the other parts of the present 
disclosure.

Detailed Description of the Invention:

Tables 1-3 show the nucleotide sequences of one strand of "contigs," prepared by 
assembling sequences derived by sequencing HSV-2, Strain SB5, DNA. Collectively, the 
contigs herein represent from 85% to over 90% of the genome of this organism. Each
of Tables 1, 2 and 3 represents a separate sequencing of the HSV-2, SB5, DNA.

Tables 1-3 also show the nucleotide sequences of open reading frames (ORFs), 
which are deduced DNA coding sequences present within each contig. Tables 1-4 also 
show the deduced amino acid sequences of polypeptides encoded by these ORFs and 
sequence homologies to proteins in the NCBI non-redundant protein database. 

Each ORF represents an HSV-2 gene, although in some cases a given ORF may
actually have been derived from a gene that is longer than the ORF.
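
By way of illustration only, the deduction of ORFs from one strand of a contig can be sketched as follows. This is a minimal sketch and not the procedure actually used to prepare the Tables: the function name `find_orfs`, the `min_codons` cutoff, and the restriction to ATG initiation codons and the three standard termination codons are all assumptions introduced here.

```python
# Hypothetical sketch: scan one strand of a contig for complete ORFs.
# Assumes ATG initiation and the standard stop codons; names are illustrative.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return sorted (start, end) spans, 0-based and half-open, of ORFs.

    An ORF here runs from the first ATG in a reading frame to the next
    in-frame stop codon. min_codons counts the stop codon as well.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it and keep scanning
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGCC", min_codons=1))
```

Because the sketch reports only ORFs that have both an initiation and a termination codon within the contig, an ORF truncated at a contig boundary (the case noted above, where the gene is longer than the ORF) would not be reported. The complementary strand can be scanned by passing its sequence to the same function.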

Each of the DNA sequences provided herein may be used in the discovery and 
development of antiviral compounds. For sequences containing an open reading frame 
(ORF) with appropriate initiation and termination codons, the encoded protein upon
expression can be used as a target for the screening of antiviral drugs. Additionally, the
DNA sequences encoding preferably the amino terminal regions of the encoded protein, or
regions immediately upstream therefrom, can be used to construct antisense sequences to
control the expression of the coding sequence of interest. Furthermore, many of the
sequences disclosed herein also provide regions upstream and downstream from the
encoding sequence. These sequences are useful as a source of regulatory elements for the
control of viral gene expression. Such sequences are conveniently isolated by restriction
enzyme action or synthesized chemically and introduced, for example, into promoter
identification strains. These strains contain a reporter structural gene sequence located
downstream from a restriction site such that if an active promoter is inserted, the reporter
gene will be expressed.
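
As an illustration of the antisense approach mentioned above, an antisense sequence directed at a chosen region of a coding strand is simply the reverse complement of that region. The following is a minimal sketch; the function names, the default oligonucleotide length of 20, and the example sequence are assumptions made for illustration and are not taken from the Tables.

```python
# Hypothetical sketch: derive an antisense DNA oligonucleotide for a region
# of a coding strand (for example, the region around the initiation codon).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start, length=20):
    """Return the antisense oligo for coding_strand[start:start + length].

    start is a 0-based offset into the coding strand; length is an
    illustrative default, not a value prescribed by the specification.
    """
    target = coding_strand[start:start + length]
    return reverse_complement(target)


if __name__ == "__main__":
    # Made-up example sequence with an ATG at offset 2.
    print(antisense_oligo("CCATGGCCAA", start=2, length=6))
```

The same `reverse_complement` helper applies when targeting a region immediately upstream of the coding sequence; only the `start` offset changes.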

Although each of the sequences may be employed as described above, this 
invention also provides several means for identifying particularly useful target genes. The 
first of these approaches entails searching appropriate databases for sequence matches in 
related organisms. Thus, if a homologue exists, the HSV-2-like form of this gene would
likely play an analogous role. For example, an HSV-2 protein identified as homologous to a
cell surface protein in another organism would be useful as a vaccine candidate. To the
extent such homologies have been identified for the sequences disclosed herein, they are
reported along with the encoding sequence in the Tables. 

A number of methods can be used to identify genes which are essential to survival
per se, or essential to the establishment/maintenance of an infection. Identification of an
unknown ORF by one of these methods yields additional information about its function and
permits the selection of such an ORF for further development as a screening target. Briefly,
these approaches include: generation of temperature-sensitive mutations (Weller, S.K., et
al., Virology 130:290-305 (1983)); site-specific insertion or deletion of a viral gene, a
method based on selection of recombinant molecules generated by double recombination
through homologous sequences between intact viral DNA molecules and a DNA fragment
containing an insertion or deletion and a selectable marker (Post, L.E., et al., Cell 25:227-32
(1981)); and insertional mutagenesis using transposons, a method taking
advantage of the random insertion of the DNA phage miniMu into target plasmid DNAs
(Jenkins, F.J., et al., Proc. Natl. Acad. Sci. USA 82:4773-4777 (1985)). Each of these
techniques may have advantages or disadvantages depending on the particular application.
The skilled artisan would choose the approach that is the most relevant with the particular
end use in mind. For example, some genes might be recognised as essential for infection
but in reality are only necessary for the initiation of infection and so their products would
represent relatively unattractive targets for antivirals developed to cure established and
chronic infections.

Use of these technologies when applied to the ORFs of the present invention 
enables identification of viral proteins expressed during infection, inhibitors of which would 
have utility in antiviral therapy.

Glossary:

The following explanations are provided to facilitate understanding of certain terms used 
frequently herein, particularly in the Examples. The explanations are provided as a convenience 
and are not limitative of the invention. 

HSV-2 BINDING MOLECULE, as used herein, refers to molecules or ions which bind 
or interact specifically with HSV-2 polypeptides or polynucleotides of the present invention,
including, for example, enzyme substrates, cell membrane components and classical receptors.
Binding between polypeptides of the invention and such molecules, including binding or
interaction molecules, may be exclusive to polypeptides of the invention, which is preferred, or it
may be highly specific for polypeptides of the invention, which is also preferred, or it may be
highly specific to a group of proteins that includes polypeptides of the invention, which is
preferred, or it may be specific to several groups of proteins at least one of which includes a 
polypeptide of the invention. Binding molecules also include antibodies and antibody-derived 
reagents that bind specifically to polypeptides of the invention. 

GENETIC ELEMENT generally means a polynucleotide comprising a region that is
important to the viral life cycle, a polynucleotide comprising a region that encodes a polypeptide
or a polynucleotide region that regulates replication, transcription or translation or other
processes important to expression of the polypeptide in a host cell, or a polynucleotide
comprising both a region that encodes a polypeptide and a region operably linked thereto that
regulates expression. Genetic elements may be comprised within a vector that replicates as an
episomal element; that is, as a molecule physically independent of the host cell genome. They
may be comprised within plasmids. Genetic elements also may be comprised within a host cell 
genome; not in their natural state but, rather, following manipulation such as isolation, cloning 
and introduction into a host cell in the form of purified DNA or in a vector, among others. 

HOST CELL is a cell which has been transformed or transfected, or is capable of
transformation or transfection by an exogenous polynucleotide sequence.

IDENTITY, as known in the art, is a relationship between two or more polypeptide
sequences or two or more polynucleotide sequences, as determined by comparing the sequences.
In the art, "identity" also means the degree of sequence relatedness between polypeptide or
polynucleotide sequences, as the case may be, as determined by the match between strings
of such sequences. "Identity" and "similarity" can be readily calculated by known methods,
including but not limited to those described in Computational Molecular Biology, Lesk,
A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and
Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis
of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey,
1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987;
Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New
York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math. 48:1073 (1988).
Preferred methods to determine identity are designed to give the largest match between the
sequences tested. Methods to determine identity and similarity are codified in publicly
available computer programs. Preferred computer program methods to determine identity
and similarity between two sequences include, but are not limited to, the GCG program
package (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)), BLASTP,
BLASTN, and FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403-410 (1990)). The
BLASTX program is publicly available from NCBI and other sources (BLAST Manual,
Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol.
215:403-410 (1990)). As an illustration, by a polynucleotide having a nucleotide sequence
having at least, for example, 95% "identity" to a reference nucleotide sequence it is 
intended that the nucleotide sequence of the tested polynucleotide is identical to the 
reference sequence except that the polynucleotide sequence may include up to five point
mutations per each 100 nucleotides of the reference nucleotide sequence. In other words, to
obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference
nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted
or substituted with another nucleotide, or a number of nucleotides up to 5% of the total
nucleotides in the reference sequence may be inserted into the reference sequence. These
mutations of the reference sequence may occur at the 5' or 3' terminal positions of the
reference nucleotide sequence or anywhere between those terminal positions, interspersed
either individually among nucleotides in the reference sequence or in one or more
contiguous groups within the reference sequence. Analogously, by a polypeptide having an
amino acid sequence having at least, for example, 95% identity to a reference amino acid
sequence it is intended that the test amino acid sequence of the polypeptide is identical to the
reference sequence except that the polypeptide sequence may include up to five amino acid
alterations per each 100 amino acids of the reference amino acid sequence. In other words, to obtain
a polypeptide having an amino acid sequence at least 95% identical to a reference amino
acid sequence, up to 5% of the amino acid residues in the reference sequence may be
deleted or substituted with another amino acid, or a number of amino acids up to 5% of the
total amino acid residues in the reference sequence may be inserted into the reference
sequence. These alterations of the reference sequence may occur at the amino or carboxy
terminal positions of the reference amino acid sequence or anywhere between those terminal
positions, interspersed either individually among residues in the reference sequence or in
one or more contiguous groups within the reference sequence.
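
The arithmetic of the 95% illustration above can be sketched as follows. This is a simplified reading of the definition, offered as an assumption-laden example only: it takes two sequences that have already been aligned (with "-" marking gaps), counts every substitution, deletion and insertion as one alteration, and compares the count with the length of the reference sequence. It is not the calculation performed by the GCG, BLAST or FASTA programs cited above, and the function names are hypothetical.

```python
# Hypothetical sketch of the "alterations per 100 reference positions" rule.
# Inputs are pre-aligned strings of equal length; "-" marks an alignment gap.

def count_alterations(aligned_ref, aligned_test, gap="-"):
    """Return (alterations, reference_length) for a pairwise alignment.

    A column counts as one alteration when the two symbols differ:
    a substitution, a deletion (gap in test) or an insertion (gap in ref).
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for r in aligned_ref if r != gap)
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    alterations = sum(
        1 for r, t in zip(aligned_ref.upper(), aligned_test.upper()) if r != t
    )
    return alterations, ref_len


def meets_identity(aligned_ref, aligned_test, threshold=95.0):
    """True if alterations do not exceed (100 - threshold)% of the reference."""
    alterations, ref_len = count_alterations(aligned_ref, aligned_test)
    return alterations * 100 <= (100 - threshold) * ref_len


if __name__ == "__main__":
    ref = "ACGTACGTACGTACGTACGT"            # 20 reference nucleotides
    print(meets_identity(ref, "ACGTACGTACGTACGTACGA"))  # one substitution
```

With a 20-nucleotide reference, one alteration is exactly 5% of the reference and so satisfies a 95% threshold under this reading, whereas two alterations do not. The same functions apply to aligned amino acid sequences.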

ISOLATED means altered "by the hand of man" from its natural state; i.e., that, if it 
occurs in nature, it has been changed or removed from its original environment, or both. For 
example, a naturally occurring polynucleotide or a polypeptide naturally present in a living 
organism in its natural state is not "isolated," but the same polynucleotide or polypeptide
separated from the coexisting materials of its natural state is "isolated", as the term is employed
herein. For example, with respect to polynucleotides, the term isolated means that it is separated
from the genome and cell in which it naturally occurs. As part of or following isolation, such
polynucleotides can be joined to other polynucleotides, such as DNAs, for mutagenesis, to form
fusion proteins, and for propagation or expression in a host, for instance. The isolated
polynucleotides, alone or joined to other polynucleotides such as vectors, can be introduced into
host cells, in culture or in whole organisms. Introduced into host cells in culture or in whole
organisms, such DNAs still would be isolated, as the term is used herein, because they would not
be in their naturally occurring form or environment. Similarly, the polynucleotides and
polypeptides may occur in a composition, such as media formulations, solutions for
introduction of polynucleotides or polypeptides, for example, into cells, compositions or
solutions for chemical or enzymatic reactions, for instance, which are not naturally occurring
compositions, and therein remain isolated polynucleotides or polypeptides within the meaning
of that term as it is employed herein. 

POLYNUCLEOTIDE(S) generally refers to any polyribonucleotide or
polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA.
Thus, for instance, polynucleotides as used herein refers to, among others, single- and
double-stranded DNA, DNA that is a mixture of single- and double-stranded regions or single-,
double- and triple-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of
single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be
single-stranded or, more typically, double-stranded, or triple-stranded, or a mixture of single- and
double-stranded regions. In addition, polynucleotide as used herein refers to triple-stranded
regions comprising RNA or DNA or both RNA and DNA. The strands in such regions may be
from the same molecule or from different molecules. The regions may include all of one or more
of the molecules, but more typically involve only a region of some of the molecules. One of the
molecules of a triple-helical region often is an oligonucleotide. As used herein, the term
polynucleotide includes DNAs or RNAs as described above that contain one or more modified
bases. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are
"polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs comprising
unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two
examples, are polynucleotides as the term is used herein. It will be appreciated that a great
variety of modifications have been made to DNA and RNA that serve many useful purposes
known to those of skill in the art. The term polynucleotide as it is employed herein embraces
such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as
the chemical forms of DNA and RNA characteristic of viruses and cells, including inter alia,
simple and complex cells. The term polynucleotide(s) embraces short polynucleotides often
referred to as oligonucleotides.

POLYPEPTIDES, as used herein, includes all polypeptides as described below. The 
basic structure of polypeptides is well known and has been described in innumerable textbooks 
and other publications in the art. In this context, the term is used herein to refer to any peptide or
protein comprising two or more amino acids joined to each other in a linear chain by peptide
bonds. As used herein, the term refers to both short chains, which also commonly are referred to
in the art as peptides, oligopeptides and oligomers, for example, and to longer chains, which 
generally are referred to in the art as proteins, of which there are many types. It will be 
appreciated that polypeptides often contain amino acids other than the 20 amino acids commonly
referred to as the 20 naturally occurring amino acids, and that many amino acids, including the
terminal amino acids, may be modified in a given polypeptide, either by natural processes, such
as processing and other post-translational modifications, but also by chemical modification 
techniques which are well known to the art. Even the common modifications that occur naturally 
in polypeptides are too numerous to list exhaustively here, but they are well described in basic 
texts and in more detailed monographs, as well as in a voluminous research literature, and they
are well known to those of skill in the art. Among the known modifications which may be
present in polypeptides of the present invention are, to name an illustrative few, acetylation, acylation,
ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme 
moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a
lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization,
disulfide bond formation, demethylation, formation of covalent cross-links, formation of cystine,
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor
formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic
processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA
mediated addition of amino acids to proteins such as arginylation, and ubiquitination. Such
modifications are well known to those of skill and have been described in great detail in the 
scientific literature. Several particularly common modifications, glycosylation, lipid attachment, 
sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, 
for instance, are described in most basic texts, such as, for instance, PROTEINS - STRUCTURE
AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W. H. Freeman and Company,
New York (1993). Many detailed reviews are available on this subject, such as, for example,
those provided by Wold, F., Posttranslational Protein Modifications: Perspectives and Prospects,
pgs. 1-12 in POSTTRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C.
Johnson, Ed., Academic Press, New York (1983); Seifter et al., Meth. Enzymol. 182:626-646
(1990) and Rattan et al., Protein Synthesis: Posttranslational Modifications and Aging, Ann.
N.Y. Acad. Sci. 663:48-62 (1992). It will be appreciated, as is well known and as noted above,
that polypeptides are not always entirely linear. For instance, polypeptides may be branched as a 
result of ubiquitination, and they may be circular, with or without branching, generally as a result 
of posttranslation events, including natural processing events and events brought about by human
manipulation which do not occur naturally. Circular, branched and branched circular
polypeptides may be synthesized by non-translation natural processes and by entirely synthetic
methods, as well. Modifications can occur anywhere in a polypeptide, including the peptide 
backbone, the amino acid side-chains and the amino or carboxyl termini. In fact, blockage of the 
amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in
naturally occurring and synthetic polypeptides and such modifications may be present in
polypeptides of the present invention, as well. For instance, the amino terminal residue of
polypeptides made in E. coli or other cells, prior to proteolytic processing, almost invariably will
be N-formylmethionine. During post-translational modification of the peptide, a methionine
residue at the NH2-terminus may be deleted. Accordingly, this invention contemplates the
use of both the methionine-containing and the methionineless amino terminal variants of
the protein of the invention. The modifications that occur in a polypeptide often will be a 
function of how it is made. For polypeptides made by expressing a cloned gene in a host, for 
instance, the nature and extent of the modifications in large part will be determined by the host 
cell posttranslational modification capacity and the modification signals present in the
polypeptide amino acid sequence. For instance, as is well known, glycosylation often does not
occur in bacterial hosts such as, for example, E. coli. Accordingly, when glycosylation is
desired, a polypeptide should be expressed in a glycosylating host, generally a eukaryotic cell.
Insect cells often carry out the same posttranslational glycosylations as do mammalian cells and,
for this reason, insect cell expression systems have been developed to express efficiently
mammalian proteins having native patterns of glycosylation. Similar considerations apply to
other modifications. It will be appreciated that the same type of modification may be present in 
the same or varying degree at several sites in a given polypeptide. Also, a given polypeptide 
may contain many types of modifications. In general, as used herein, the term polypeptide 
encompasses all such modifications, particularly those that are present in polypeptides
synthesized by expressing a polynucleotide in a host cell.

VARIANT(S) as the term is used herein, is a polynucleotide or polypeptide that 
differs from a reference polynucleotide or polypeptide respectively, but retains essential 
properties. A typical variant of a polynucleotide differs in nucleotide sequence from 
another, reference polynucleotide. Changes in the nucleotide sequence of the variant may
or may not alter the amino acid sequence of a polypeptide encoded by the reference
polynucleotide. Nucleotide changes may result in amino acid substitutions, additions, 
deletions, fusions and truncations in the polypeptide encoded by the reference sequence, as 
discussed below. A typical variant of a polypeptide differs in amino acid sequence from 
another, reference polypeptide. Generally, differences are limited so that the sequences of
the reference polypeptide and the variant are closely similar overall and, in many regions,
identical. A variant and reference polypeptide may differ in amino acid sequence by one or
more substitutions, additions and deletions in any combination. A substituted or inserted
amino acid residue may or may not be one encoded by the genetic code. A variant of a
polynucleotide or polypeptide may be naturally occurring, such as an allelic variant, or it
may be a variant that is not known to occur naturally. Non-naturally occurring variants of
polynucleotides and polypeptides may be made by mutagenesis techniques, by direct
synthesis, and by other recombinant methods known to skilled artisans.

Deposit:

HSV-2, strain SB5 has been deposited at the American Type Culture Collection 
under accession number VR-2546 on October 31, 1996.

The deposits referred to herein will be maintained under the terms of the Budapest 
Treaty on the International Recognition of the Deposit of Micro-organisms for Purposes of 
Patent Procedure. These deposits are provided merely as a convenience to those of skill in
the art and are not an admission that a deposit is required under 35 U.S.C. §112. The
sequence of the polynucleotides contained in the deposited material, as well as the amino
acid sequence of the polypeptides encoded thereby, are incorporated herein by reference and 
are controlling in the event of any conflict with any description of sequences herein. A 
license may be required to make, use or sell the deposited material, and no such license is 
hereby granted. 

Viral Strain and Genome:

The nucleotide sequences disclosed herein can be obtained by synthetic chemical 
techniques known in the art or can be obtained from HSV-2, strain SB5 by probing a DNA 
preparation with probes constructed from the particular sequences disclosed herein. 
Alternatively, oligonucleotides derived from a disclosed sequence can act as PCR primers
in a process of PCR-based cloning of the sequence from a viral genomic source. It is
recognised that such sequences will also have utility in diagnosis of the stage of infection
and type of infection the pathogen has attained. 

The present invention relates to novel HSV-2 polypeptides and polynucleotides 
encoding same, among other things, as described in greater detail below. The invention relates
especially to HSV-2 molecules having the nucleotide and amino acid sequences set out in Tables
1-4 and to the HSV-2 nucleotide and amino acid sequences of the DNA isolatable from Deposit
No. ATCC VR-2546, which is herein referred to as "the deposited organism" or as the "DNA of
the deposited organism." It will be appreciated that the nucleotide and amino acid sequences set
out in Tables 1-4 were obtained by sequencing the DNA of the deposited organism. Hence, the
sequence of the deposited clone is controlling as to any discrepancies between it (and the
sequence it encodes) and the sequences of the Tables. 

The present invention also relates to additional polynucleotide sequences disclosed 
herein, which are RNAs transcribed from the DNAs disclosed herein but which may or may not 
be translated into protein. Such polynucleotides are known in HSV-1 and other herpes viruses.

Polynucleotides

In accordance with one aspect of the present invention, there are provided isolated 
polynucleotides which encode HSV-2 polypeptides having the deduced amino acid sequence of 
Tables 1-4. It is preferred that these polynucleotides be one of those set forth in Tables 1, 2 or 3. 
The skilled artisan can readily determine the polynucleotide sequence of such preferred
polynucleotides by reference to the ORF start and stop positions set forth in Tables 1-4. 

Using the information provided herein, such as the polynucleotide sequence set out in 
Tables 1-3, a polynucleotide of the present invention encoding HSV-2 polypeptide may be 
obtained using standard cloning and screening procedures. To obtain the polynucleotide
encoding the protein using the DNA sequences given in Tables 1-3, typically a library of clones
of chromosomal DNA of HSV-2 strain SB5 in E. coli or some other suitable host is probed with
a radiolabelled oligonucleotide, preferably a 17mer or longer, derived from a sequence of Tables
1-3. Clones carrying DNA identical to that of the probe can then be distinguished using high
stringency washes. By sequencing the individual clones thus identified with sequencing primers
designed from the original sequence it is then possible to extend the sequence in both directions
to determine the full gene sequence. Conveniently such sequencing is performed using
denatured double stranded DNA prepared from a plasmid clone. Suitable techniques are
described by Maniatis, T., Fritsch, E.F. and Sambrook, J. in MOLECULAR CLONING, A
Laboratory Manual (2nd edition 1989 Cold Spring Harbor Laboratory; see Screening By
Hybridization 1.90 and Sequencing Denatured Double-Stranded DNA Templates 13.70).

The DNA sequences set out in Tables 1, 2 and 3 each contain at least one open reading
frame encoding a protein having at least about the number of amino acid residues set forth in
Tables 1-3. The start and stop codons of each open reading frame are the first three and the last
three nucleotides of each polynucleotide set forth in Tables 1, 2 and 3.
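
By way of illustration only, the statement that each open reading frame begins with its start codon and ends with its stop codon can be checked with a short sketch such as the following. The function names, the assumption of an ATG start codon, and the toy sequence are hypothetical and are not taken from the Tables.

```python
# Standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_bounds_ok(seq: str, start_codon: str = "ATG") -> bool:
    """Return True if seq starts with the start codon, ends with a stop
    codon, and has a length divisible by three.

    Assumption: the ORF uses ATG as its start codon unless told otherwise.
    """
    s = seq.upper()
    return (
        len(s) >= 6
        and len(s) % 3 == 0
        and s[:3] == start_codon
        and s[-3:] in STOP_CODONS
    )


def encoded_residues(seq: str) -> int:
    """Number of amino acid residues encoded, excluding the stop codon."""
    return len(seq) // 3 - 1


toy = "ATGGCTAAATGA"  # hypothetical 12-nucleotide example
print(orf_bounds_ok(toy), encoded_residues(toy))
```

Such a check only confirms the boundaries of a candidate reading frame; it does not detect stop codons internal to the frame.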

Certain HSV-2 sequences of the invention are structurally related to sequences encoding
other proteins of the herpes family, as shown by comparing the sequences of the Tables with
sequences reported in the literature. Moreover, certain polynucleotides and polypeptides of
the invention are structurally related to known proteins. These proteins exhibit greatest homology to the
homologue listed in Tables 1, 2, 3 and 4 from among the known proteins.

The invention provides a polynucleotide sequence identical over its entire length to each
coding sequence in Tables 1-3. Also provided by the invention is the coding sequence for the
mature polypeptide or a fragment thereof, by itself as well as the coding sequence for the mature
polypeptide or a fragment in reading frame with other coding sequence, such as those encoding a
leader or secretory sequence, a pre-, or pro- or prepro- protein sequence. The polynucleotide
may also contain non-coding sequences, including for example, but not limited to non-coding 5'
and 3' sequences, such as the transcribed, non-translated sequences, termination signals,
ribosome binding sites, sequences that stabilize mRNA, introns, polyadenylation signals, and
additional coding sequence which encode additional amino acids. For example, a marker
sequence that facilitates purification of the fused polypeptide can be encoded. In certain
embodiments of the invention, the marker sequence is a hexa-histidine peptide, as provided in
the pQE vector (Qiagen, Inc.) and described in Gentz et al., Proc. Natl. Acad. Sci., USA 86: 821-824
(1989), or an HA tag (Wilson et al., Cell 37: 767 (1984)). Polynucleotides of the invention
also include, but are not limited to, polynucleotides comprising a structural gene and its naturally 
associated sequences that control gene expression. 

The invention also includes polynucleotides of the formula:

X-(R1)m-(R2)-(R3)n-Y

wherein, at the 5' end of the molecule, X is hydrogen, and at the 3' end of the molecule, Y is
hydrogen or a metal, R1 and R3 are any nucleic acid residue, n and/or m is an integer between 1
and 3000 or zero, and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid
sequence selected from the group set forth in Tables 1, 2 and 3, as well as an ORF sequence
selected from the group set forth in Tables 1, 2, 3 and 4 (as indicated by the reading frame
numbering). In the polynucleotide formula above R2 is oriented so that its 5' end residue is at the
left, bound to R1, and its 3' end residue is at the right, bound to R3. Any stretch of nucleic acid
residues denoted by either R group, where n and/or m is greater than 1, may be either a
heteropolymer or a homopolymer, preferably a heteropolymer. In a preferred embodiment n
and/or m is an integer between 1 and 1000, or 2000 or 3000.

The term "polynucleotide encoding a polypeptide" as used herein encompasses 
polynucleotides that include a sequence encoding a polypeptide of the invention, particularly a 
viral polypeptide and more particularly a polypeptide of HSV-2 having an amino acid
sequence set out in Table 1, 2, 3 or 4. The term also encompasses polynucleotides that include a
single continuous region or discontinuous regions encoding the polypeptide (for example, 
interrupted by integrated phage or an insertion sequence or editing) together with additional 
regions, that also may contain coding and/or non-coding sequences. 

The invention further relates to variants of the polynucleotides described herein that
encode variants of the polypeptide having the deduced amino acid sequence of Tables 1, 2, 3
and 4. Variants that are fragments of the polynucleotides of the invention may be used to 
synthesize full-length polynucleotides of the invention. 

Polynucleotides of the present invention may be in the form of RNA, such as mRNA, or 
in the form of DNA, including, for instance, cDNA and genomic DNA obtained by cloning or
produced by chemical synthetic techniques or by a combination thereof. The DNA may be
double-stranded or single-stranded. Single-stranded DNA may be the coding strand, also known
as the sense strand, or it may be the non-coding strand, also referred to as the anti-sense strand. 

The coding sequence which encodes the polypeptide may be identical to the coding 
sequence of the polynucleotide shown in Tables 1-4. It also may be a polynucleotide with a
different sequence, which, as a result of the redundancy (degeneracy) of the genetic code,
encodes the polypeptides of Tables 1-4. 
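
The degeneracy referred to above can be illustrated with a minimal sketch, given here under stated assumptions: the standard genetic code is used, and the two short sequences are hypothetical examples that do not appear in the Tables. Two different DNA sequences are translated and shown to encode the same peptide.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 0; '*' marks a stop codon."""
    s = dna.upper().replace("U", "T")
    usable = len(s) - len(s) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences differing at three third-codon positions.
seq_a = "ATGGCTAAAGAA"
seq_b = "ATGGCCAAGGAG"

print(seq_a != seq_b, translate(seq_a), translate(seq_b))
```

Both sequences translate to the same four residues even though they differ at the nucleotide level, which is the sense in which a polynucleotide with a different sequence may encode the same polypeptide.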

Particularly preferred embodiments are polynucleotides encoding polypeptide variants
that have the amino acid sequence of a polypeptide of Tables 1, 2, 3 and/or 4 in which several, a
few, 5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues are substituted, deleted or added, in
any combination. Especially preferred among these are silent substitutions, additions and
deletions that do not alter the properties and activities of such polynucleotide.

Further preferred embodiments of the invention are polynucleotides that are at least
50%, 60% or 70% identical over their entire length to a polynucleotide encoding a polypeptide
having the amino acid sequence set out in Tables 1, 2, 3 or 4, and polynucleotides that are
complementary to such polynucleotides. Alternatively, most highly preferred are
polynucleotides that comprise a region that is at least 80% identical over its entire length to a
polynucleotide encoding a polypeptide of the deposited strain and polynucleotides
complementary thereto. In this regard, polynucleotides at least 90% identical over their entire
length to the same are particularly preferred, and among these particularly preferred
polynucleotides, those with at least 95% identity are especially preferred. Furthermore, those with at
least 97% identity are highly preferred among those with at least 95%, and among these those with at
least 98% and at least 99% identity are particularly highly preferred, with at least 99% being the most
preferred.
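
As a non-limiting sketch of how the identity thresholds above might be applied, the following computes percent identity over the entire length of two sequences. It assumes, for simplicity, that the sequences have already been aligned and are of equal length without gaps; in practice identity is determined with an alignment program. The function names and the toy sequences are hypothetical.

```python
# Identity thresholds recited in the text, highest first.
THRESHOLDS = (99, 98, 97, 95, 90, 80, 70, 60, 50)


def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match over the entire length.

    Assumption: a and b are pre-aligned, gap-free and of equal length.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def highest_threshold_met(pid: float):
    """Return the highest recited threshold that pid satisfies, or None."""
    for t in THRESHOLDS:
        if pid >= t:
            return t
    return None


pid = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # one mismatch in ten
print(pid, highest_threshold_met(pid))
```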

A preferred embodiment is an isolated polynucleotide comprising a polynucleotide
sequence selected from the group consisting of: a polynucleotide having at least a 50% identity
to a polynucleotide encoding a polypeptide comprising the amino acid sequence of Tables 1, 2,
3, or 4 and obtained from a prokaryotic species other than HSV-2; and a polynucleotide encoding
a polypeptide comprising an amino acid sequence which is at least 50% identical to the amino
acid sequence of Tables 1, 2, 3 or 4 and obtained from a prokaryotic species other than HSV-2.

Preferred embodiments are polynucleotides that encode polypeptides that retain
substantially the same biological function or activity as the mature polypeptide encoded by the
DNA of Tables 1, 2, 3 or 4.

The invention further relates to polynucleotides that hybridize to the herein above-described
sequences. In this regard, the invention especially relates to polynucleotides that
hybridize under stringent conditions to the herein above-described polynucleotides. As herein
used, the terms "stringent conditions" and "stringent hybridization conditions" mean
hybridization will occur only if there is at least 95% and preferably at least 97% identity between
the sequences. An example of stringent hybridization conditions is overnight incubation at
42°C in a solution comprising: 50% formamide, 5x SSC (1x SSC = 150 mM NaCl, 15 mM trisodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate,
and 20 micrograms/ml denatured, sheared salmon sperm DNA, followed by washing the
hybridization support in 0.1x SSC at about 65°C. Hybridization and wash conditions are
well known and exemplified in Sambrook, et al., Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor, N.Y., (1989), particularly Chapter 11 therein.

The invention also provides a polynucleotide consisting essentially of a
polynucleotide sequence obtainable by screening an appropriate library containing the
complete gene for a polynucleotide sequence set forth in Tables 1, 2, 3 or 4 under stringent
hybridization conditions with a probe having the sequence of said polynucleotide sequence
or a fragment thereof; and isolating said DNA sequence. Fragments useful for obtaining
such a polynucleotide include, for example, probes and primers described elsewhere herein.
As discussed additionally herein regarding polynucleotide assays of the invention, for
instance, polynucleotides of the invention as discussed above may be used as a hybridization
probe for RNA, cDNA and genomic DNA to isolate full-length cDNAs and genomic clones
encoding a polypeptide and to isolate cDNA and genomic clones of other genes that have a high
sequence similarity to a polynucleotide set forth in Table 1, 2, 3 or 4. Such probes generally
will comprise at least 15 bases. Preferably, such probes will have at least 30 bases and may have
at least 50 bases. Particularly preferred probes will have at least 30 bases and will have 50 bases
or less.
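
Purely as an illustrative sketch, a probe complementary to a chosen window of a target sequence, and satisfying the minimum length of 15 bases stated above, could be derived as follows. The function names, the default length of 30 bases and the toy target are hypothetical; no account is taken here of melting temperature, secondary structure or labelling.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

MIN_PROBE_LEN = 15  # minimum probe length stated in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_probe(target: str, start: int, length: int = 30) -> str:
    """Return a probe complementary to target[start:start + length].

    Assumption: the default of 30 bases follows the preferred lower
    bound in the text; any length of at least 15 bases is accepted.
    """
    if length < MIN_PROBE_LEN:
        raise ValueError("probe must comprise at least 15 bases")
    if start < 0 or start + length > len(target):
        raise ValueError("probe window falls outside the target")
    return reverse_complement(target[start:start + length])


target = "ACGT" * 10  # hypothetical 40-base target
probe = design_probe(target, 0, 30)
print(len(probe), probe)
```

Taking the reverse complement of the probe recovers the targeted window, which is a convenient sanity check when many probes are generated.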

For example, the coding region of each gene that comprises or is comprised by a 
polynucleotide set forth in Table 1, 2, 3 or 4 may be isolated by screening using a DNA sequence 
provided in Table 1, 2, 3 or 4 to synthesize an oligonucleotide probe. A labeled oligonucleotide 
having a sequence complementary to that of a gene of the invention is then used to screen a
library of cDNA, genomic DNA or mRNA to determine which members of the library the probe
hybridizes to.

Polynucleotides of the invention that are oligonucleotides derived from a
polynucleotide or polypeptide sequence set forth in Table 1, 2, 3 or 4 may be used in the
processes herein as described, but preferably for PCR, to determine whether or not the
polynucleotides identified herein in whole or in part are transcribed in virus in infected
tissue. It is recognized that such sequences will also have utility in diagnosis of the stage of
infection and type of infection the pathogen has attained.

The invention also provides polynucleotides that may encode a polypeptide that is the 
mature protein plus additional amino or carboxyl-terminal amino acids, or amino acids interior to
the mature polypeptide (when the mature form has more than one polypeptide chain, for
instance). Such sequences may play a role in processing of a protein from precursor to a mature
form, may allow protein transport, may lengthen or shorten protein half-life or may facilitate 
manipulation of a protein for assay or production, among other things. As generally is the case 
in vivo, the additional amino acids may be processed away from the mature protein by cellular 
enzymes. 

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. When prosequences are removed such
inactive precursors generally are activated. Some or all of the prosequences may be removed 
before activation. Generally, such precursors are called proproteins. 

The DNA may also comprise a promoter region which functions to direct the
transcription of the mRNA encoding the HSV-2 polypeptides of this invention. Such promoters may be
independently useful to direct the transcription of heterologous genes in recombinant expression
systems. Polyadenylation and splicing signal sequences are also present in the polynucleotide
sequence and may be useful as gene expression signals in heterologous gene expression vectors
and constructs.

The polynucleotides and polypeptides of the invention may be employed, for example,
as research reagents and materials for discovery of treatments of and diagnostics for disease,
particularly human disease, as further discussed herein relating to polynucleotide assays. 

The polynucleotides of the invention that are oligonucleotides may also be used as
nucleic acid amplification primers, such as PCR primers, in the process herein described to
determine whether or not the HSV-2 genes identified herein in whole or in part are present
or transcribed in infected tissue. It is recognized that such sequences will also have utility 
in diagnosis of the stage of infection and type of infection the pathogen has attained. 

In addition to the uses mentioned above for the polynucleotides of this invention, the
following applications are also contemplated by this invention. Inter alia, the
polynucleotides disclosed herein or portions thereof, may be used as probes to discover
mRNA transcripts synthesized during productive and latent HSV-2 infections, for
example by Northern blot, nuclease protection, and primer extension experiments.
Novel transcripts in turn can lead to the discovery of new HSV-2 proteins not deducible
from the genome sequences directly. The sequences, or portions thereof, may be used
to discover antisense inhibitors of virus replication and novel therapeutics based on
antisense mechanisms. The sequences, or portions thereof, may be used to prepare 
novel gene therapy vectors. The sequences or portions thereof may be used as a basis 
for the generation of DNA- or RNA-containing oligonucleotides designed to form a 
triplex with duplex DNA, for use as analytical tools, diagnostics or therapeutics. 

Nucleic acid sequences, or portions thereof, can be used to generate cell lines useful for
diagnostics or screening. The DNA sequences can be used to predict restriction enzyme
sites useful for replacing the gene in the viral genome with a marker gene such as lacZ
or green fluorescent protein. Such a replacement is useful in defining the biological role
of the gene in the viral life cycle. These gene knockout experiments are useful to
discover genes which are likely to be high quality drug discovery targets (essential
genes) or good locations for foreign genes for the purposes of gene therapy (non-essential
genes) through an HSV-2 viral vector. Such gene replacements are also useful
for discovering virulence factors, for example by comparing the pathogenicity of the
modified virus with the unmodified virus or through the ease of identifying a marker
gene such as lacZ.

In addition to the standard A, G, C, T/U representations for nucleic acid bases, the 
term "N" is also used. "N" means that any of the four DNA or RNA bases may appear at 
such a designated position in the DNA or RNA sequence, except it is preferred that N is not
a base that when taken in combination with adjacent nucleotide positions, when read in the
correct reading frame, would have the effect of generating a premature termination codon in
such reading frame. 
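
The preference stated above regarding "N" can be sketched, by way of example only, as a check of which bases at an ambiguous position would avoid creating a termination codon in a given reading frame. The function name and the toy sequence are hypothetical; the caller is assumed to exclude the genuine terminal stop codon of the open reading frame, since only premature termination is of concern.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def allowed_bases_at_n(seq: str, pos: int, frame: int = 0) -> set:
    """Return the bases that, placed at the 'N' at index pos, do not
    produce a stop codon in the given reading frame.

    If the codon is incomplete or contains another 'N', no base can be
    ruled out, so all four bases are returned.
    """
    s = seq.upper().replace("U", "T")
    if s[pos] != "N":
        raise ValueError("position does not hold an N")
    codon_start = pos - ((pos - frame) % 3)
    if codon_start < 0 or codon_start + 3 > len(s):
        return set("ACGT")
    allowed = set()
    for base in "ACGT":
        codon = s[codon_start:pos] + base + s[pos + 1:codon_start + 3]
        if "N" in codon or codon not in STOP_CODONS:
            allowed.add(base)
    return allowed


# Hypothetical sequence: the second codon is "TAN".
print(sorted(allowed_bases_at_n("ATGTANGCT", 5)))
```

In the toy sequence, A and G at the ambiguous position would give TAA and TAG respectively, so only C and T remain.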

In sum, a polynucleotide of the invention may encode a mature protein, a mature protein 
plus a leader sequence (which may be referred to as a preprotein), a precursor of a mature protein 
having one or more prosequences that are not the leader sequences of a preprotein, or a 
preproprotein, which is a precursor to a proprotein, having a leader sequence and one or more
prosequences, which generally are removed during processing steps that produce active and 
mature forms of the polypeptide. 

Polypeptides 

The present invention further relates to HSV-2 polypeptides that have the deduced 
amino acid sequences of the polypeptides defined by amino acid sequence in Tables 1-4.
The invention also relates to fragments, analogs and derivatives of these polypeptides.
The terms "fragment," "derivative" and "analog" when referring to the polypeptides of the
invention mean a polypeptide which retains essentially the same biological function or activity
as such polypeptide. Fragments, derivatives and analogs that retain at least 90% of the
biological activity of the native HSV-2 protein are preferred. Fragments, derivatives and analogs
that retain at least 95% of the activity of the native HSV-2 protein are more preferred. Thus, an analog
includes a proprotein which can be activated by cleavage of the proprotein portion to produce an
active mature polypeptide.

The polypeptide of the present invention may be a recombinant polypeptide, a natural 
polypeptide or a synthetic polypeptide. In certain preferred embodiments it is a recombinant
polypeptide. 

The fragment, derivative or analog of the polypeptides of the invention may be (i) one
in which one or more of the amino acid residues are substituted with a conserved or non-conserved
amino acid residue (preferably a conserved amino acid residue) and such substituted
amino acid residue may or may not be one encoded by the genetic code, or (ii) one in which one
or more of the amino acid residues includes a substituent group, or (iii) one in which the mature
polypeptide is fused with another compound, such as a compound to increase the half-life of the
polypeptide (for example, polyethylene glycol), or (iv) one in which additional amino acids
are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence which is
employed for purification of the mature polypeptide or a proprotein sequence. Such fragments,
derivatives and analogs are deemed to be obtainable by those of ordinary skill in the art from the
teachings herein.

Among preferred variants are those that vary from a reference by conservative amino
acid substitutions. Such substitutions are those that substitute a given amino acid in a
polypeptide by another amino acid of like characteristics. Typically seen as conservative
substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val,
Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues
Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic
residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr.
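
The conservative substitution groups listed above lend themselves to a simple lookup. The following sketch, offered as an illustration only, encodes exactly those six groups using single-letter amino acid codes and classifies the differences between a reference and a variant of equal length. The function names and the toy peptides are hypothetical.

```python
# The six groups recited in the text, in single-letter code.
CONSERVATIVE_GROUPS = [
    {"A", "V", "L", "I"},  # aliphatic: Ala, Val, Leu, Ile
    {"S", "T"},            # hydroxyl: Ser, Thr
    {"D", "E"},            # acidic: Asp, Glu
    {"N", "Q"},            # amide: Asn, Gln
    {"K", "R"},            # basic: Lys, Arg
    {"F", "Y"},            # aromatic: Phe, Tyr
]


def is_conservative(a: str, b: str) -> bool:
    """True if replacing residue a with residue b stays within one group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(ref: str, var: str) -> list:
    """List (position, ref_residue, var_residue, conservative?) for each
    difference. Positions are 1-based.

    Assumption: ref and var are of equal length (substitutions only).
    """
    if len(ref) != len(var):
        raise ValueError("sequences must be of equal length")
    return [
        (i + 1, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(ref.upper(), var.upper()))
        if r != v
    ]


print(classify_substitutions("MALKD", "MAIKE"))
```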

Further particularly preferred in this regard are variants, analogs, derivatives and
fragments having the amino acid sequence of one or more of the HSV-2 polypeptides of the
invention, in which several, a few, 5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues are
substituted, deleted or added, in any combination. Especially preferred among these are silent
substitutions, additions and deletions, which do not alter the properties and activities of the HSV-2
protein. Also especially preferred in this regard are conservative substitutions. Most highly
preferred are polypeptides having the amino acid sequences of Tables 1-4 without substitutions.

The invention also includes polypeptides of the formula:

X-(R1)n-(R2)-(R3)m-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or a
metal, R1 and R3 are any amino acid residue, n and/or m is an integer between 1 and 2000 or
zero, and R2 is an amino acid sequence of the invention, particularly an amino acid sequence
selected from the group set forth in Tables 1, 2, 3 and 4. In the formula above R2 is oriented so
that its amino terminal residue is at the left, bound to R1, and its carboxy terminal residue is at
the right, bound to R3. Any stretch of amino acid residues denoted by either R group, where n
and/or m is greater than 1, may be either a heteropolymer or a homopolymer, preferably a
heteropolymer. In preferred embodiments n and/or m is an integer between 1 and 1000 or 2000.

The polypeptides and polynucleotides of the present invention are preferably provided in 
an isolated form, and preferably are purified to homogeneity. 

The polypeptides of the present invention include the polypeptides of Tables 1-4, in
particular the mature polypeptide, as well as polypeptides which have at least 60%, 70% or 80%
identity to one or more of the polypeptides of Tables 1-4 and preferably at least 90% similarity to
one or more of the polypeptides of Tables 1-4 and more preferably at least 95% similarity; and
still more preferably at least 95% identity to one or more of the polypeptides of Tables 1-4 and
also include portions of such polypeptides with such portion of the polypeptide generally
containing at least 30 contiguous amino acids and more preferably at least 50 contiguous amino
acids.

In addition to the uses mentioned above for the polypeptides of this invention, the
following applications are also contemplated by this invention. Inter alia, the polypeptides
disclosed herein or portions thereof which have enzymatic activity or structural functionality
are useful as a source of those proteins for screening and/or therapy. Such polypeptides
may be identified by homology, for example, to HSV-1 polypeptides
with known function (e.g., helicases, kinases, proteases). Use of polypeptides of the
invention for screening or therapy based upon functionality predicted by homology match is
a particularly preferred aspect of this invention. Also the polypeptides derived from the
deposited strain ATCC VR-2546 herein can be used for comparison with sequences from
other HSV-2 strains in the public domain; for example, comparison of the polypeptides of
the invention with strain HG52 may be useful in the discovery of virulence factors, since
HG52 is avirulent in mouse and guinea pig infection models and HSV-2 SB5 is virulent.
Similarly, public domain homologs from strain MS may be useful in the discovery of
virulence factors since there are major differences in the CNS pathogenesis in animal
models between strains MS and SB5.

In addition to the standard single and triple letter representations for amino acids, 
the term "X" or "Xaa" is also used. "X" and "Xaa" mean that any of the twenty naturally 
occurring amino acids may appear at such a designated position in the polypeptide sequence.

Fragments 

Fragments or portions of the polypeptides of the present invention may be employed for 
producing the corresponding full-length polypeptide by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length polypeptides. 

Fragments or portions of the polynucleotides of the present invention may be used to synthesize
full-length polynucleotides of the present invention. 

Also among preferred embodiments of this aspect of the present invention are 
polypeptides comprising fragments of HSV-2, most particularly fragments of HSV-2 having the
amino acid sequences set out in Tables 1-4, and variants and derivatives thereof. 

In this regard, a fragment is a polypeptide having an amino acid sequence that entirely is
the same as part but not all of the amino acid sequence of the aforementioned HSV-2
polypeptides and variants or derivatives thereof. 

Such fragments may be "free-standing," i.e., not part of or fused to other amino acids or
polypeptides, or they may be comprised within a larger polypeptide of which they form a part or
region. When comprised within a larger polypeptide, the presently discussed fragments most
preferably form a single continuous region. However, several fragments may be comprised
within a single larger polypeptide. For instance, certain preferred embodiments relate to a
fragment of a HSV-2 polypeptide of the present invention comprised within a precursor polypeptide
designed for expression in a host and having heterologous pre and pro-polypeptide regions fused
to the amino terminus of the HSV-2 fragment and an additional region fused to the carboxyl
terminus of the fragment. Therefore, fragments, in one aspect of the meaning intended herein,
refer to the portion or portions of a fusion polypeptide or fusion protein derived from HSV-2.

Representative examples of polypeptide fragments of the invention include, for
example, those which are about 5-15, 10-20, 15-40, 30-55, 41-75, 41-80, 41-90, 50-100,
75-100, 90-115, 100-125, 110-140, 120-150, 200-300, 1-175, 1-600 or 1-1000 amino acids
long. Particular examples of polypeptide fragments of the invention that may be mentioned
include fragments of 20-200 amino acids.

In this context "about" includes the particularly recited range and ranges larger or smaller
by several, a few, 5, 4, 3, 2 or 1 amino acid at either extreme or at both extremes.

Among especially preferred fragments of the invention are truncation mutants of HSV-2.
Truncation mutants include HSV-2 polypeptides having the amino acid sequences of Tables 1-4,
or of variants or derivatives thereof, except for deletion of a continuous series of residues (that is,
a continuous region, part or portion) that includes the amino terminus, or a continuous series of
residues that includes the carboxyl terminus or, as in double truncation mutants, deletion of two
continuous series of residues, one including the amino terminus and one including the carboxyl 
terminus. Fragments having the size ranges set out above also are preferred embodiments of 
truncation fragments, which are especially preferred among fragments generally. Degradation 
forms of the polypeptides of the invention in a host cell are also preferred. 
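
The three kinds of truncation mutant described above (amino-terminal, carboxyl-terminal and double) can be sketched, for illustration only, as slices of an amino acid sequence. The function name, parameter names and the toy peptide are hypothetical.

```python
def truncate(seq: str, n_del: int = 0, c_del: int = 0) -> str:
    """Delete n_del residues from the amino terminus and c_del residues
    from the carboxyl terminus of seq.

    n_del > 0 only: amino-terminal truncation mutant
    c_del > 0 only: carboxyl-terminal truncation mutant
    both > 0:       double truncation mutant
    """
    if n_del < 0 or c_del < 0:
        raise ValueError("deletion lengths must be non-negative")
    if n_del + c_del >= len(seq):
        raise ValueError("deletions would remove the entire sequence")
    return seq[n_del:len(seq) - c_del]


peptide = "MALKDEFG"  # hypothetical 8-residue example
print(truncate(peptide, n_del=2))           # amino-terminal truncation
print(truncate(peptide, c_del=3))           # carboxyl-terminal truncation
print(truncate(peptide, n_del=2, c_del=3))  # double truncation
```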

Also preferred in this aspect of the invention are fragments characterized by structural or
functional attributes of HSV-2. Preferred embodiments of the invention in this regard include
fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet
and beta-sheet-forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"),
coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming
regions and high antigenic index regions of HSV-2.

Further preferred regions are those that mediate activities of HSV-2. Most highly
preferred in this regard are fragments that have a chemical, biological or other activity of the
particular HSV-2 protein, including those with a similar activity or an improved activity, or with
a decreased undesirable activity. Routinely one generates the fragment by well-known methods
then compares the activity of the fragment to the native protein in a convenient assay such as
listed hereinbelow. Highly preferred in this regard are fragments that contain regions that are
homologs in sequence, or in position, or in both sequence and position, to active regions of related
polypeptides, such as the related polypeptides set out in Table 1. Among particularly preferred
fragments in these regards are truncation mutants, as discussed above. Further preferred
polynucleotide fragments are those that are antigenic or immunogenic in an animal, especially in
a human.

It will be appreciated that the invention also relates to, among others, polynucleotides 
encoding the aforementioned fragments, polynucleotides that hybridize to polynucleotides 
encoding the fragments, particularly those that hybridize under stringent conditions, and
polynucleotides, such as PCR primers, for amplifying polynucleotides that encode the fragments.
In these regards, preferred polynucleotides are those that correspond to the preferred fragments, 
as discussed above.

Vectors, host cells, expression:

The present invention also relates to vectors which comprise a polynucleotide or
polynucleotides of the present invention, host cells which are genetically engineered with vectors
of the invention and the production of polypeptides of the invention by recombinant techniques.

Host cells can be genetically engineered to incorporate polynucleotides and express
polypeptides of the present invention. Introduction of a polynucleotide into the host cell can be
effected by calcium phosphate transfection, DEAE-dextran mediated transfection, transvection,
microinjection, cationic lipid-mediated transfection, electroporation, transduction, scrape loading,
ballistic introduction, infection or other methods. Such methods are described in many standard
laboratory manuals, such as Davis et al., BASIC METHODS IN MOLECULAR BIOLOGY,
(1986) and Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed.,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

Polynucleotide constructs in host cells can be used in a conventional manner to produce
the gene product encoded by the recombinant sequence. Alternatively, the polypeptides of the
invention can be synthetically produced by conventional peptide synthesizers.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells
under the control of appropriate promoters. Cell-free translation systems can also be employed
to produce such proteins using RNAs derived from the DNA constructs of the present invention.
Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are
described by Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed.,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

In accordance with this aspect of the invention the vector may be, for example, a
plasmid vector, a single or double-stranded phage vector, or a single or double-stranded RNA or
DNA viral vector. Plasmids generally are designated herein by a lower case p preceded and/or
followed by capital letters and/or numbers, in accordance with standard naming conventions that
are familiar to those of skill in the art. Starting plasmids disclosed herein are either commercially
available, publicly available, or can be constructed from available plasmids by routine
application of well known, published procedures. Many plasmids and other cloning and
expression vectors that can be used in accordance with the present invention are well known and
readily available to those of skill in the art.

Preferred among vectors, in certain respects, are those for expression of polynucleotides
and polypeptides of the present invention. Generally, such vectors comprise cis-acting control
regions effective for expression in a host operatively linked to the polynucleotide to be expressed.
Appropriate trans-acting factors either are supplied by the host, supplied by a complementing
vector or supplied by the vector itself upon introduction into the host.

In certain preferred embodiments in this regard, the vectors provide for specific
expression. Such specific expression may be inducible expression or expression only in certain
types of cells or both inducible and cell-specific. Particularly preferred among inducible vectors
are vectors that can be induced for expression by environmental factors that are easy to
manipulate, such as temperature and nutrient additives. A variety of vectors suitable to this
aspect of the invention, including constitutive and inducible expression vectors for use in
prokaryotic and eukaryotic hosts, are well known and employed routinely by those of skill in the
art.

A great variety of expression vectors can be used to express a polypeptide of the
invention. Such vectors include, among others, chromosomal, episomal and virus-derived
vectors, e.g., vectors derived from viral plasmids, from bacteriophage, from transposons, from
yeast episomes, from insertion elements, from yeast chromosomal elements, from viruses such as
baculoviruses, papova viruses, such as SV40, vaccinia viruses, adenoviruses, fowl pox viruses,
pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as
those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids.
All may be used for expression in accordance with this aspect of the present invention. Generally,
any vector suitable to maintain, propagate or express polynucleotides to express a polypeptide in
a host may be used for expression in this regard.

The appropriate DNA sequence may be inserted into the vector by any of a variety of
well-known and routine techniques, such as, for example, those set forth in Sambrook et al.,
MOLECULAR CLONING, A LABORATORY MANUAL, 2nd Ed.; Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York (1989).

The DNA sequence in the expression vector is operatively linked to appropriate
expression control sequence(s), including, for instance, a promoter to direct mRNA transcription.
Representatives of such promoters include, but are not limited to, the phage lambda PL
promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters
of retroviral LTRs.

In general, expression constructs will contain sites for transcription initiation and
termination, and, in some instances, in the transcribed region, a ribosome binding site for
translation. The coding portion of the mature transcripts expressed by the constructs will include
a translation initiating AUG at the beginning and a termination codon appropriately positioned at
the end of the polypeptide to be translated.

In addition, the constructs may contain control regions that regulate as well as engender
expression. Generally, in accordance with many commonly practiced procedures, such regions
will operate by controlling transcription, for example through transcription factors, repressor
binding sites and termination, among others.

Vectors for propagation and expression generally will include selectable markers and
amplification regions, such as, for example, those set forth in Sambrook et al., MOLECULAR
CLONING, A LABORATORY MANUAL, 2nd Ed.; Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, New York (1989).

Representative examples of appropriate hosts include bacterial cells, such as
streptococci, staphylococci, E. coli, streptomyces and Bacillus subtilis cells; fungal cells, such as
yeast cells and Aspergillus cells; insect cells such as Drosophila S2 and Spodoptera Sf9 cells;
animal cells such as CHO, COS, HeLa, C127, 3T3, BHK, 293 and Bowes melanoma cells; and
plant cells.

The following vectors, which are commercially available, are provided by way of
example. Among vectors preferred for use in bacteria are pQE70, pQE60 and pQE-9, available
from Qiagen; pBS vectors, Phagescript vectors, Bluescript vectors, pNH8A, pNH16a, pNH18A,
pNH46A, available from Stratagene; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5
available from Pharmacia, and pBR322 (ATCC 37017). Among preferred eukaryotic vectors are
pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, pBPV,
pMSG and pSVL available from Pharmacia. These vectors are listed solely by way of
illustration of the many commercially available and well known vectors that are available to
those of skill in the art for use in accordance with this aspect of the present invention. It will be
appreciated that any other plasmid or vector suitable for, for example, introduction, maintenance,
propagation or expression of a polynucleotide or polypeptide of the invention in a host may be
used in this aspect of the invention.

Promoter regions can be selected from any desired gene using vectors that contain a
reporter transcription unit lacking a promoter region, such as a chloramphenicol acetyl transferase
("CAT") transcription unit, downstream of a restriction site or sites for introducing a candidate
promoter fragment; i.e., a fragment that may contain a promoter. As is well known, introduction
into the vector of a promoter-containing fragment at the restriction site upstream of the cat gene
engenders production of CAT activity, which can be detected by standard CAT assays. Vectors
suitable to this end are well known and readily available, such as pKK232-8 and pCM7.
Promoters for expression of polynucleotides of the present invention include not only well
known and readily available promoters, but also promoters that readily may be obtained by the
foregoing technique, using a reporter gene.

Among known prokaryotic promoters suitable for expression of polynucleotides and
polypeptides in accordance with the present invention are the E. coli lacI and lacZ
promoters, the T3 and T7 promoters, the gpt promoter, the lambda PR, PL promoters and the trp
promoter.

Among known eukaryotic promoters suitable in this regard are the CMV immediate
early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, the
promoters of retroviral LTRs, such as those of the Rous sarcoma virus ("RSV"), and
metallothionein promoters, such as the mouse metallothionein-I promoter.

Recombinant expression vectors will include, for example, origins of replication, a
promoter preferably derived from a highly-expressed or regulatable gene to direct transcription of
a downstream structural sequence, and a selectable marker to permit isolation of
vector-containing cells after exposure to the vector.

Polynucleotides of the invention, encoding the heterologous structural sequence of a
polypeptide of the invention, generally will be inserted into the vector using standard techniques
so that they are operably linked to the promoter for expression. The polynucleotide will be
positioned so that the transcription start site is located appropriately 5' to the AUG that initiates
translation of the polypeptide to be expressed. Where applicable, a ribosome binding site may be
located between the transcription start site and the initiating AUG. Generally, there will be no
other open reading frames that begin with an initiation codon, usually AUG, and lie between the
ribosome binding site, where applicable, or the 5' end of the transcript and the initiation codon.
Also, generally, there will be a translation stop codon at the end of the polypeptide and there will
be a polyadenylation signal in constructs for use in eukaryotic hosts. A transcription termination
signal appropriately disposed at the 3' end of the transcribed region may also be included in the
polynucleotide construct.
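
By way of illustration only, the layout rules in the preceding paragraph can be checked computationally. The following is a minimal sketch, not part of the disclosed methods; the function name `check_transcript`, its arguments and the example sequence are hypothetical, and the check considers only the initiating AUG, upstream AUG codons and an in-frame stop codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def check_transcript(mrna, start):
    """Check a transcript (5' to 3') against the construct layout rules.

    mrna  -- transcript sequence; DNA letters are converted to RNA
    start -- 0-based index of the intended initiating AUG (hypothetical input)
    """
    mrna = mrna.upper().replace("T", "U")
    report = {
        # the coding portion should begin with a translation-initiating AUG
        "starts_with_aug": mrna[start:start + 3] == "AUG",
        # an AUG beginning 5' of the intended start could open a competing frame
        "upstream_aug": any(mrna[i:i + 3] == "AUG" for i in range(start)),
        "in_frame_stop": False,
    }
    # walk codon by codon from the start and look for a termination codon
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            report["in_frame_stop"] = True
            break
    return report

# Hypothetical example: 10-nt leader, then AUG GCU UAA, then 2 trailing nt
example = "GGAGGUAACC" + "AUGGCUUAA" + "GG"
print(check_transcript(example, start=10))
```

A construct passing this check has an initiating AUG at the stated position, no AUG 5' of it, and a stop codon in the same reading frame; the sketch does not evaluate promoters, ribosome binding sites or polyadenylation signals.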

For secretion of the translated protein into the lumen of the endoplasmic reticulum, into
the periplasmic space or into the extracellular environment, appropriate secretion signals may be
incorporated into the expressed polypeptide. These signals may be endogenous to the
polypeptide or they may be heterologous signals.

The polypeptide may be expressed in a modified form, such as a fusion protein, and may
include not only secretion signals but also additional heterologous functional regions. Thus, for
instance, a region of additional amino acids, particularly charged amino acids, may be added to
the N- or C-terminus of the polypeptide to improve stability and persistence in the host cell,
during purification or during subsequent handling and storage. Also, a region may be added to
the polypeptide to facilitate purification. Such regions may be removed prior to final preparation
of the polypeptide. The addition of peptide moieties to polypeptides to engender secretion or
excretion, to improve stability or to facilitate purification, among others, are familiar and routine
techniques in the art. A preferred fusion protein comprises a heterologous region from
immunoglobulin that is useful to solubilize or purify polypeptides. For example, EP-A-0 464
533 (Canadian counterpart 2045869) discloses fusion proteins comprising various portions
of constant region of immunoglobulin molecules together with another protein or part
thereof. In drug discovery, for example, proteins have been fused with antibody Fc portions
for the purpose of high-throughput screening assays to identify antagonists. See, D. Bennett
et al., Journal of Molecular Recognition, 8: 52-58 (1995) and K. Johanson et al., The
Journal of Biological Chemistry, 270(16): 9459-9471 (1995).

Cells typically then are harvested by centrifugation, disrupted by physical or chemical
means, and the resulting crude extract retained for further purification.

Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents; such methods are well known to those skilled in the art.

Mammalian expression vectors may comprise an origin of replication, a suitable
promoter and enhancer, and also any necessary polyadenylation sites, splice donor and acceptor
sites, transcriptional termination sequences, and 5' flanking non-transcribed sequences that are
necessary for expression. In certain preferred embodiments in this regard DNA sequences
derived from the SV40 splice sites, and the SV40 polyadenylation sites are used for required
non-transcribed genetic elements of these types.

HSV-2 polypeptides can be recovered and purified from recombinant cell cultures by
well-known methods including ammonium sulfate or ethanol precipitation, acid extraction, anion
or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography and lectin
chromatography. Most preferably, high performance liquid chromatography ("HPLC") is
employed for purification. Well known techniques for refolding protein may be employed to
regenerate active conformation when the polypeptide is denatured during isolation and/or
purification.

Polypeptides of the present invention include naturally purified products, products of
chemical synthetic procedures, and products produced by recombinant techniques from a
prokaryotic or eukaryotic host, including, for example, viral, yeast, higher plant, insect and
mammalian cells. Depending upon the host employed in a recombinant production procedure,
the polypeptides of the present invention may be glycosylated or may be non-glycosylated. In
addition, polypeptides of the invention may also include an initial modified methionine residue,
in some cases as a result of host-mediated processes.

HSV-2 polynucleotides and polypeptides may be used in accordance with the present
invention for a variety of applications, particularly those that make use of the chemical and
biological properties of HSV-2. Additional applications relate to diagnosis and to treatment of
disorders of cells, tissues and organisms. These aspects of the invention are illustrated further by
the following discussion.

Polynucleotide assays: 
This invention is also related to the use of the HSV-2 polynucleotides to detect
complementary polynucleotides, for example, as a diagnostic reagent. Detection of HSV-2
polynucleotides in a eukaryote, particularly a mammal, and especially a human, will provide a
diagnostic method that can add to, define or allow a diagnosis of a disease. Eukaryotes (herein
also "individual(s)"), particularly mammals, and especially humans, infected by HSV-2 may be
detected at the DNA or RNA level by a variety of techniques. Nucleic acids for diagnosis may
be obtained from an individual's cells, tissues, and fluids, such as brain, bone, blood, muscle,
cartilage, skin, saliva, urine, semen, and mucous. Tissue biopsy and autopsy material are also
preferred for samples from an individual to use in a diagnostic assay. The viral DNA may be
used directly for detection or may be amplified enzymatically by using PCR prior to analysis
(Saiki et al., Nature 324: 163-166 (1986)). RNA or cDNA may also be used in the same ways.
As an example, PCR primers complementary to the nucleic acid encoding HSV-2 can be used to
identify and analyze HSV-2 presence and expression. Using PCR, characterization of the strain
of virus present in a eukaryote, particularly a mammal, and especially a human, may be made by
an analysis of the genotype of the viral gene. For example, deletions and insertions can be
detected by a change in size of the amplified product in comparison to the genotype of a
reference sequence. Point mutations can be identified by hybridizing amplified DNA to
radiolabeled HSV-2 RNA or alternatively, radiolabeled HSV-2 antisense DNA sequences.
Perfectly matched sequences can be distinguished from mismatched duplexes by RNase A
digestion or by differences in melting temperatures.
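
The size comparison and melting temperature considerations described above may be sketched as follows. This is an illustrative example under stated assumptions only: the function names and the tolerance parameter are hypothetical, and the melting temperature estimate uses the familiar rule of thumb of 2 degrees per A or T and 4 degrees per G or C, which is a rough approximation for short oligonucleotides and is not taken from this specification.

```python
def classify_amplicon(reference_len, observed_len, tolerance=0):
    """Compare an observed PCR product size with the reference size (both in bp)."""
    diff = observed_len - reference_len
    if abs(diff) <= tolerance:
        return "no size change"
    kind = "insertion" if diff > 0 else "deletion"
    return f"{kind} of about {abs(diff)} bp"

def wallace_tm(oligo):
    """Rough melting temperature (deg C) of a short oligonucleotide: 2(A+T) + 4(G+C)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc

# Hypothetical values: a 500 bp reference product and a 470 bp observed product
print(classify_amplicon(500, 470))
print(wallace_tm("ATGCATGCATGCATGC"))
```

The `tolerance` argument stands in for the sizing resolution of the gel or instrument used; a duplex containing a mismatch would be expected to melt below the estimate for the perfectly matched duplex, although the sketch does not attempt to quantify that difference.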

Sequence differences between a reference gene and genes having mutations also may be
revealed by direct DNA sequencing. In addition, cloned DNA segments may be employed as
probes to detect specific DNA segments. The sensitivity of such methods can be greatly
enhanced by appropriate use of PCR or another amplification method. For example, a
sequencing primer is used with double-stranded PCR product or a single-stranded template
molecule generated by a modified PCR. The sequence determination is performed by
conventional procedures with radiolabeled nucleotide or by automatic sequencing procedures
with fluorescent tags.

Genetic typing of various strains of virus based on DNA sequence differences may be
achieved by detection of alteration in electrophoretic mobility of DNA fragments in gels, with or
without denaturing agents. Small sequence deletions and insertions can be visualized by high
resolution gel electrophoresis. DNA fragments of different sequences may be distinguished on
denaturing formamide gradient gels in which the mobilities of different DNA fragments are
retarded in the gel at different positions according to their specific melting or partial melting
temperatures (see, e.g., Myers et al., Science, 230: 1242 (1985)).

Sequence changes at specific locations also may be revealed by nuclease protection
assays, such as RNase and S1 protection or the chemical cleavage method (e.g., Cotton et al.,
Proc. Natl. Acad. Sci. USA, 85: 4397-4401 (1985)).

Thus, the detection of a specific DNA sequence may be achieved by methods such as
hybridization, RNase protection, chemical cleavage, direct DNA sequencing or the use of
restriction enzymes (e.g., restriction fragment length polymorphisms ("RFLP")) and Southern
blotting of genomic DNA.
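
The principle behind restriction fragment length polymorphism analysis can be illustrated with a simple in silico digest. The following is a minimal sketch only; the function `digest`, the short test sequences and the choice of the EcoRI recognition site (GAATTC, cut after the first G) are assumptions made for illustration and do not correspond to any sequence of this invention.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    seq        -- DNA sequence, 5' to 3' (top strand only)
    site       -- recognition sequence, e.g. "GAATTC" for EcoRI
    cut_offset -- position of the cut within the site (1 for EcoRI, G^AATTC)
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical reference with two EcoRI sites, and a variant in which a
# point mutation (A to G) destroys the second site
reference = "AAA" + "GAATTC" + "TTTT" + "GAATTC" + "CC"
variant = "AAA" + "GAATTC" + "TTTT" + "GAGTTC" + "CC"
print(digest(reference, "GAATTC", 1))  # three fragments
print(digest(variant, "GAATTC", 1))    # two fragments
```

Loss or gain of a recognition site changes the number and sizes of the fragments, which is the difference that would be observed on a gel or Southern blot.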

In addition to more conventional gel-electrophoresis and DNA sequencing, mutations 
also can be detected by in situ analysis. 

Cells carrying mutations or polymorphisms in the gene of the present invention may
also be detected at the DNA level by a variety of techniques, to allow for serotyping, for
example. Nucleic acids for diagnosis may be obtained from an infected individual's cells,
including but not limited to blood, urine, saliva, tissue biopsy and autopsy material or from virus
isolated and cultured from the above or other sources. The viral DNA may be used directly for
detection or may be amplified enzymatically by using PCR (Saiki et al., Nature, 324: 163-166
(1986)) prior to analysis. RT-PCR can also be used to detect mutations. It is particularly
preferred to use RT-PCR in conjunction with automated detection systems, such as, for
example, GeneScan. RNA or cDNA may also be used for the same purpose, PCR or RT-PCR.
As an example, PCR primers complementary to the nucleic acid encoding HSV-2 can be used to
identify and analyze mutations. For example, deletions and insertions can be detected by a
change in size of the amplified product in comparison to the normal genotype. Point mutations
can be identified by hybridizing amplified DNA to radiolabeled RNA or alternatively,
radiolabeled antisense DNA sequences. Perfectly matched sequences can be distinguished from
mismatched duplexes by RNase A digestion or by differences in melting temperatures. The
primers may be used to amplify the gene isolated from the individual such that the gene may
then be subject to various techniques for elucidation of the DNA sequence. In this way,
mutations in the DNA sequence may be detected.
Polypeptide assays: 

The present invention also relates to diagnostic assays such as quantitative and
diagnostic assays for detecting levels of HSV-2 protein in cells and tissues, including
determination of normal and abnormal levels. Thus, for instance, a diagnostic assay in
accordance with the invention for detecting expression of HSV-2 protein compared to normal
control tissue samples may be used to detect the presence of an infection. Assay techniques that
can be used to determine levels of a protein, such as an HSV-2 protein of the present invention,
in a sample derived from a host are well-known to those of skill in the art. Such assay methods
include radioimmunoassays, competitive-binding assays, Western Blot analysis and ELISA
assays. Among these, ELISAs frequently are preferred. An ELISA assay initially comprises
preparing an antibody specific to HSV-2, preferably a monoclonal antibody. In addition a
reporter antibody generally is prepared which binds to the monoclonal antibody. The reporter
antibody is attached to a detectable reagent such as a radioactive, fluorescent or enzymatic reagent,
in this example horseradish peroxidase enzyme.

Antibodies:

The polypeptides, their fragments or other derivatives, or analogs thereof, or cells
expressing them can be used as an immunogen to produce antibodies thereto. The present
invention includes, for example, monoclonal and polyclonal antibodies, chimeric, single chain,
and humanized antibodies, as well as Fab fragments, or the product of an Fab expression library.

Antibodies generated against the polypeptides corresponding to a sequence of the
present invention can be obtained by direct injection of the polypeptides into an animal or by
administering the polypeptides to an animal, preferably a nonhuman. The antibody so obtained
will then bind the polypeptides themselves. In this manner, even a sequence encoding only a fragment
of the polypeptides can be used to generate antibodies binding the whole native polypeptides.
Such antibodies can then be used to isolate the polypeptide from tissue expressing that
polypeptide.

For preparation of monoclonal antibodies, any technique known in the art which
provides antibodies produced by continuous cell line cultures can be used. Examples include
various techniques, such as those in Kohler, G. and Milstein, C., Nature 256: 495-497 (1975);
Kozbor et al., Immunology Today 4: 72 (1983); Cole et al., pg. 77-96 in MONOCLONAL
ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc. (1985).

Techniques described for the production of single chain antibodies (U.S. Patent No.
4,946,778) can be adapted to produce single chain antibodies to immunogenic polypeptide
products of this invention. Also, transgenic mice, or other organisms such as other mammals,
may be used to express humanized antibodies to immunogenic polypeptide products of this
invention.

Alternatively phage display technology could be utilized to select antibody genes
with binding activities towards the polypeptide either from repertoires of PCR amplified
v-genes of lymphocytes from humans screened for possessing anti-Fbp or from naive libraries
(McCafferty, J. et al., Nature 348, 552-554 (1990); Marks, J. et al., Biotechnology 10:
779-783 (1992)). The affinity of these antibodies can also be improved by chain shuffling
(Clackson, T. et al., Nature 352, 624-628 (1991)).

If two antigen binding domains are present each domain may be directed against a
different epitope - termed 'bispecific' antibodies.

The above-described antibodies may be employed to isolate or to identify clones 
expressing the polypeptide or purify the polypeptide of the present invention by attachment of 
the antibody to a solid support for isolation and/or purification by affinity chromatography. 

Thus among others, antibodies against HSV-2 may be employed to inhibit and/or treat
infections, particularly viral infections, and especially HSV-2 infections as well as to monitor the
effectiveness of antibiotic treatment.

Polypeptide derivatives include antigenically, epitopically or immunologically
equivalent derivatives which form a particular aspect of this invention. The term
"antigenically equivalent derivative" as used herein encompasses a polypeptide or its
equivalent which will be specifically recognized by certain antibodies which, when raised
to the protein or polypeptide according to the present invention, interfere with the
immediate physical interaction between pathogen and mammalian host. The term
"immunologically equivalent derivative" as used herein encompasses a peptide or its
equivalent which when used in a suitable formulation to raise antibodies in a vertebrate, the
antibodies act to interfere with the immediate physical interaction between pathogen and
mammalian host.

The polypeptide, such as an antigenically or immunologically equivalent derivative
or a fusion protein thereof, is used as an antigen to immunize a mouse or other animal such
as a rabbit, rat or chicken. The fusion protein may provide stability to the polypeptide. The
antigen may be associated, for example by conjugation, with an immunogenic carrier
protein, for example bovine serum albumin (BSA) or keyhole limpet haemocyanin (KLH).
Alternatively a multiple antigenic peptide comprising multiple copies of the protein or
polypeptide, or an antigenically or immunologically equivalent polypeptide thereof may be
sufficiently antigenic to improve immunogenicity so as to obviate the use of a carrier.

Preferably the antibody or derivative thereof is modified to make it less
immunogenic in the individual. For example, if the individual is human the antibody may
most preferably be "humanised", where the complementarity determining region(s) of the
hybridoma-derived antibody has been transplanted into a human monoclonal antibody, for
example as described in Jones, P. et al., Nature 321: 522-525 (1986) or Tempest et al.,
Biotechnology 9: 266-273 (1991). The above antibody reagents will also be useful for
assessing the biological role of the gene through antibody inhibition studies,
immunoprecipitation studies, super-shift experiments and similar techniques. These studies
may lead to discovery of novel protein-protein interactions which may be useful drug
targets. The above antibody reagents may lead to the identification of novel viral proteins
not predicted by the DNA sequence, which in turn may be novel drug targets.

HSV-2 binding molecules and assays:

This invention also provides a method for identification of molecules, such as binding
molecules, that bind HSV-2. Genes encoding proteins that bind HSV-2, such as binding
proteins, can be identified by numerous methods known to those of skill in the art, for example,
ligand panning and FACS sorting. Such methods are described in many laboratory manuals
such as, for instance, Coligan et al., Current Protocols in Immunology 1(2): Chapter 5 (1991).

For instance, expression cloning may be employed for this purpose. To this end
polyadenylated RNA is prepared from a cell expressing HSV-2, a cDNA library is created from
this RNA, the library is divided into pools and the pools are transfected individually into cells
that are not expressing HSV-2. The transfected cells then are exposed to labeled HSV-2. HSV-2
can be labeled by a variety of well-known techniques including standard methods of
radio-iodination or inclusion of a recognition site for a site-specific protein kinase. Following
exposure, the cells are fixed and binding of HSV-2 is determined. These procedures
conveniently are carried out on glass slides.

Alternatively a labeled ligand can be photoaffinity linked to a cell extract, such as a
membrane or a membrane extract, prepared from cells that express a molecule that it binds, such
as a binding molecule. Cross-linked material is resolved by polyacrylamide gel electrophoresis
("PAGE") and exposed to X-ray film. The labeled complex containing the ligand and its binding
molecule can be excised, resolved into peptide fragments, and subjected to protein
microsequencing. The amino acid sequence obtained from microsequencing can be used to design
unique or degenerate oligonucleotide probes to screen cDNA libraries to identify genes encoding
the putative binding molecule.
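
As an illustration of how a peptide sequence obtained by microsequencing might guide the design of degenerate oligonucleotide probes, the following sketch counts the number of distinct coding sequences for a peptide under the standard genetic code and locates the least degenerate window. The function names and the example peptide are hypothetical; the approach is offered as one possible sketch rather than as the method of this specification.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide.upper())

def least_degenerate_window(peptide, length):
    """Return (start index, window, degeneracy) of the best window for a probe."""
    windows = [
        (i, peptide[i:i + length], degeneracy(peptide[i:i + length]))
        for i in range(len(peptide) - length + 1)
    ]
    return min(windows, key=lambda w: w[2])

# Hypothetical microsequencing result
print(least_degenerate_window("LSRMWKDE", 4))
```

Windows rich in methionine and tryptophan, each encoded by a single codon, give the least degenerate probes, whereas leucine, serine and arginine, each encoded by six codons, increase degeneracy sharply.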

Polypeptides of the invention also can be used to assess HSV-2 binding capacity of
HSV-2 binding molecules in cells or in cell-free preparations.

Polypeptides of the invention may also be used to assess the binding of small molecule
substrates and ligands in, for example, cells, cell-free preparations, chemical libraries, and natural
product mixtures. These substrates and ligands may be natural substrates and ligands or may be
structural or functional mimetics.

This invention also provides a method of screening drugs to identify those which
interfere with the proteins selected as targets herein, which method comprises measuring the
interference of the activity of the protein by a test drug. For example if the protein selected
has a catalytic activity, after suitable purification and formulation the activity of the enzyme
can be followed by its ability to convert its natural substrates. By incorporating different
chemically synthesised test compounds or natural products into such an assay of enzymatic
activity one is able to detect those additives which compete with the natural substrate or
otherwise inhibit enzymatic activity.
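
The screening comparison described above may, for illustration, be reduced to a percent inhibition calculation relative to an uninhibited control. The following is a minimal sketch; the function names, the compound labels, the rate values and the 50 percent hit threshold are all hypothetical and are not taken from this specification.

```python
def percent_inhibition(rate_with_compound, rate_control, rate_blank=0.0):
    """Percent inhibition of enzymatic activity relative to the uninhibited control."""
    span = rate_control - rate_blank
    if span <= 0:
        raise ValueError("control rate must exceed blank rate")
    return 100.0 * (1.0 - (rate_with_compound - rate_blank) / span)

def find_hits(rates, rate_control, threshold=50.0):
    """Return names of test compounds whose inhibition meets the threshold."""
    return [
        name for name, rate in rates.items()
        if percent_inhibition(rate, rate_control) >= threshold
    ]

# Hypothetical substrate conversion rates (arbitrary units) for two test compounds
rates = {"cmpd_A": 2.5, "cmpd_B": 9.0}
print(find_hits(rates, rate_control=10.0))
```

Compounds identified in this way would still need to be examined further, for example to distinguish competition with the natural substrate from other modes of inhibition.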

The invention also relates to activators and inhibitors identified thereby.

Another aspect of the invention relates to use of a polynucleotide in genetic
immunization, and will preferably employ a suitable delivery method such as direct
injection of plasmid DNA into muscles (Wolff et al., Hum. Mol. Genet. 1:363 (1992);
Manthorpe et al., Hum. Gene Ther. 4:419 (1993)), delivery of DNA complexed with
specific protein carriers (Wu et al., J. Biol. Chem. 264:16985 (1989)), coprecipitation of
DNA with calcium phosphate (Benvenisty & Reshef, Proc. Natl. Acad. Sci. USA 83:9551
(1986)), encapsulation of DNA in various forms of liposomes (Kaneda et al., Science
243:375 (1989)), particle bombardment (Tang et al., Nature 356:152 (1992); Eisenbraun et
al., DNA Cell Biol. 12:791 (1993)) and in vivo infection using cloned retroviral vectors
(Seeger et al., Proc. Natl. Acad. Sci. USA 81:5849 (1984)). Suitable promoters for muscle
transfection include CMV, RSV, SRα, actin, MCK, alpha globin, adenovirus and
dihydrofolate reductase.

In therapy or as a prophylactic, the active agent, i.e., the polypeptide, polynucleotide
or inhibitor of the invention, may be administered to a patient as an injectable composition,
for example as a sterile aqueous dispersion, preferably isotonic.
Vaccines: 

Another aspect of the invention relates to a method for inducing an immunological
response in an individual, particularly a mammal, which comprises inoculating the
individual with HSV-2 polypeptide, or an antigenic fragment or variant thereof, adequate to
produce antibody to protect said individual from infection, particularly HSV-2 infection.
Yet another aspect of the invention relates to a method of inducing immunological response
in an individual which comprises, through gene therapy, delivering a gene encoding HSV-2,
or an antigenic fragment or a variant thereof, for expressing HSV-2, or a fragment or a
variant thereof in vivo in order to induce an immunological response to produce antibody to
protect said individual from disease.

A further aspect of the invention relates to an immunological composition which,
when introduced into a host capable of having induced within it an immunological
response, induces an immunological response in such host to HSV-2 or a protein coded
therefrom, wherein the composition comprises a recombinant HSV-2 or protein coded
therefrom comprising DNA which codes for and expresses an antigen of said HSV-2 or
protein coded therefrom.

The HSV-2 or a fragment thereof may be fused with a co-protein which may not by 
itself produce antibodies, but is capable of stabilizing the first protein and producing a fused 
protein which will have immunogenic and protective properties. This fused recombinant 
protein preferably further comprises an antigenic co-protein, such as Glutathione-S-transferase 
(GST) or beta-galactosidase, relatively large co-proteins which solubilise the 
protein and facilitate production and purification thereof. Moreover, the co-protein may act 
as an adjuvant in the sense of providing a generalized stimulation of the immune system. 
The co-protein may be attached to either the amino or carboxy terminus of the first protein. 

The present invention also includes a vaccine formulation which comprises the 
immunogenic recombinant protein together with a suitable carrier. Since the protein may 
be broken down in the stomach, it is preferably administered parenterally, including, for 
example, administration that is subcutaneous, intramuscular, intravenous, or intradermal. 
Formulations suitable for parenteral administration include aqueous and non-aqueous sterile 
injection solutions which may contain anti-oxidants, buffers, bacteriostats and solutes 
which render the formulation isotonic with the bodily fluid, preferably the blood, of the 
individual; and aqueous and non-aqueous sterile suspensions which may include suspending 
agents or thickening agents. The formulations may be presented in unit-dose or multi-dose 
containers, for example, sealed ampoules and vials, and may be stored in a freeze-dried 
condition requiring only the addition of the sterile liquid carrier immediately prior to use. 
The vaccine formulation may also include adjuvant systems for enhancing the 
immunogenicity of the formulation, such as oil-in-water systems and other systems known 
in the art. The dosage will depend on the specific activity of the vaccine and can be readily 
determined by routine experimentation. 

Whilst the invention has been described with reference to a certain HSV-2 
polypeptide, it is to be understood that this covers fragments of the naturally occurring 
protein and similar proteins (for example, having sequence homologies of 75% or greater) 
with additions, deletions or substitutions which do not substantially affect the immunogenic 
properties of the recombinant protein. 
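
The 75% figure above can be checked for a candidate variant once it has been aligned to the reference protein. The following is a minimal sketch only, assuming that "sequence homology" is read as percent identity over aligned columns and that the two strings have already been aligned, with "-" marking gaps. The function names and example strings are hypothetical and do not come from the Tables.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, '-' for gaps).
    Columns gapped in both sequences are ignored; a gap opposite a residue
    counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_homology_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 75.0) -> bool:
    """True when percent identity is at or above the threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 8-column alignment: one substitution and one gap.
    print(percent_identity("MKTAYI-K", "MKSAYIAK"))
```

Other definitions of homology (for example, similarity scored with a substitution matrix) would give different values; the sketch uses identity only because it needs no further assumptions.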
Compositions: 

The invention also relates to compositions comprising the polynucleotide or the 
polypeptides discussed above or the inhibitors. Thus, the polypeptides of the present invention 
may be employed in combination with a non-sterile or sterile carrier or carriers for use with cells, 
tissues or organisms, such as a pharmaceutical carrier suitable for administration to a subject. 
Such compositions comprise, for instance, a media additive or a therapeutically effective amount 
of a polypeptide of the invention and a pharmaceutically acceptable carrier or excipient. Such 
carriers may include, but are not limited to, saline, buffered saline, dextrose, water, glycerol, 
ethanol and combinations thereof. The formulation should suit the mode of administration. 
Kits: 

The invention further relates to diagnostic and pharmaceutical packs and kits comprising 
one or more containers filled with one or more of the ingredients of the aforementioned 
compositions of the invention. Associated with such container(s) can be a notice in the form 
prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals 
or biological products, reflecting approval by the agency of the manufacture, use or sale of the 
product for human administration. 

Administration: 

Polypeptides and other compounds of the present invention may be employed alone or 
in conjunction with other compounds, such as therapeutic compounds. 

The pharmaceutical compositions may be administered in any effective, convenient 
manner including, for instance, administration by topical, oral, anal, vaginal, intravenous, 
intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes among others. 

The pharmaceutical compositions generally are administered in an amount effective for 
treatment or prophylaxis of a specific indication or indications. It will be appreciated that 
optimum dosage will be determined by standard methods for each treatment modality and 
indication, taking into account the indication, its severity, route of administration, complicating 
conditions and the like. 

In therapy or as a prophylactic, the active agent may be administered to an 
individual as an injectable composition, for example as a sterile aqueous dispersion, 
preferably isotonic. 

Alternatively, the composition may be formulated for topical application, 
for example in the form of ointments, creams, lotions, eye ointments, eye drops, ear drops, 
mouthwash, impregnated dressings and sutures and aerosols, and may contain appropriate 
conventional additives, including, for example, preservatives, solvents to assist drug 
penetration, and emollients in ointments and creams. Such topical formulations may also 
contain compatible conventional carriers, for example cream or ointment bases, and ethanol 
or oleyl alcohol for lotions. Such carriers may constitute from about 1% to about 98% by 
weight of the formulation; more usually they will constitute up to about 80% by weight of 
the formulation. 

For administration to mammals, and particularly humans, it is expected that the 
daily dosage level of the active agent will be from 0.01 mg/kg to 10 mg/kg, typically 
around 1 mg/kg. The physician in any event will determine the actual dosage which will be 
most suitable for an individual and which will vary with the age, weight and response of the 
particular individual. The above dosages are exemplary of the average case. There can, of 
course, be individual instances where higher or lower dosage ranges are merited, and such 
are within the scope of this invention. 

The composition of the invention may be administered by injection to achieve a 
systemic effect against relevant virus shortly before insertion of an in-dwelling device. 
Treatment may be continued after surgery during the in-body time of the device. In 
addition, the composition could also be used to broaden perioperative cover for any surgical 
technique to prevent viral reactivation. 

Alternatively, the composition of the invention may be used to bathe an indwelling 
device immediately before insertion. 

A vaccine composition is conveniently in injectable form. Conventional adjuvants 
may be employed to enhance the immune response. 

A suitable unit dose for vaccination is 0.5-5 microgram/kg of antigen, and such 
dose is preferably administered 1-3 times and with an interval of 1-3 weeks. 

With the indicated dose range, no adverse toxicological effects will be observed 
with the compounds of the invention which would preclude their administration to suitable 
individuals. 

In order to facilitate understanding of the following example certain frequently 
occurring methods and/or terms will be described. 

Example 1 

Preparation of ultra-purified herpes simplex virus type 2 DNA: 

This protocol describes the preparation of herpes simplex virus type 2 strain SB5 
DNA for sequencing. It is a combination of two protocols, both of which have been 
modified. Part one describes the crude isolation of the viral DNA from host cell DNA 
(Hirt, B., J. Mol. Biol. 26:365-369 (1967)), and part two describes the ultra-purification 
of the viral DNA through a cesium chloride (CsCl) gradient (Vinograd, J., et al., Proc. Nat'l. 
Acad. Sci. (USA) 49:902-910 (1963)). 

I. Separation of viral DNA from host DNA (modified from Hirt, supra) 
Confluent monolayers of Vero cells (ATCC CCL 81) previously seeded into roller 
bottles (1 x 10^8 cells/bottle) were infected with HSV-2 strain SB5 at an MOI = 0.01 in 
HBSS. After one hour, the virus inoculum was removed and normal media was added 
(DMEM, 10% FCS). 

Approximately 40-48 hours post-infection, infected monolayers were harvested by 
scraping, and placed in 10ml of cold 1x PBS. For subsequent steps, three roller bottles of 
infected cells were combined (3 x 10^8 cells). The cells were spun at 2000g x 5 minutes. 
The supernatant was removed and to the cell pellet, 25ml of DNA extraction buffer was 
added (0.25% Triton X-100, 10mM EDTA, 10mM Tris pH 8.0). 

The lysate was mixed at room temperature for 10 minutes. Then, to the lysate, 1ml 
of 5M NaCl (0.2M final concentration) was added and allowed to mix another 15 minutes. 
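
The stated final NaCl concentration can be checked with a simple dilution calculation. This is a sketch only, assuming the 1ml of 5M stock is added to the 25ml of extraction buffer and that the cell pellet volume is negligible. The function name is hypothetical.

```python
def final_concentration(stock_molar: float, stock_ml: float, sample_ml: float) -> float:
    """Final molarity after adding stock_ml of stock solution to sample_ml of sample.

    Uses C1 * V1 = C2 * (V1 + V2) and assumes the volumes are additive.
    """
    return stock_molar * stock_ml / (stock_ml + sample_ml)


if __name__ == "__main__":
    # 1 ml of 5 M NaCl into 25 ml of lysate.
    molarity = final_concentration(5.0, 1.0, 25.0)
    print(f"{molarity:.3f} M")
```

Under these assumptions the result is about 0.19M, which rounds to the 0.2M quoted in the protocol.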
The lysate was centrifuged at 10,000g for 30 minutes at 4°C. The supernatant, 
which contains the viral DNA, was saved and the pellet, which contains mostly 
chromosomal DNA, was discarded. 

To the supernatant, SDS was added to 0.5% final conc. and Proteinase K to 
150ug/ml final conc. This was incubated 2 hours at 45°C. 

After two hours, 2.5 volumes of 100% ethanol were added. Viral DNA was 
precipitated overnight at -20°C. 

The precipitate was centrifuged at 10,000g for 30 minutes at 4°C. The pellet was 
washed once with 70% ethanol and air dried for 30 minutes. Then the pellet was 
resuspended in 250ul of TE (10mM Tris, pH 7.5, 2mM EDTA). 

RNase A was added to a final concentration of 10ug/ml and incubated at 37°C for 
one hour. 

SDS and Proteinase K were then added (as above) and incubated overnight at 37°C. 

The DNA was phenol extracted 2x, chloroform extracted 1x, and 1/10 volume 3M 
sodium acetate and 2.5 volumes of 100% ethanol were added and allowed to 
precipitate overnight at -20°C. The next day, the precipitate was spun down at 15,000g x 
20 minutes. The pellet was washed 1x with 70% ethanol, briefly air dried and resuspended 
in 1ml of TE. 

II. Ultrapurification of the viral DNA through a CsCl gradient (modified from 
Vinograd, et al. supra) 

A cesium chloride solution of 57% w/w with the prepared DNA from above was 
made as follows: 

To the 1ml of viral DNA prepared above, 9ml of TE was added for a total of 
exactly 10ml. To this, 13.26g of CsCl was added and dissolved. This solution was added 
to ultracentrifuge tubes and spun in a VTi 40 rotor at 35,000 rpm for 72 hours at 25°C. 

After centrifugation, the tube was mounted on a gradient collector and through a 
hole pierced in the bottom, 15 drop fractions were collected. 
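
The 57% w/w figure for the cesium chloride solution described above follows from the stated masses. The sketch below is illustrative only and assumes the 10ml of DNA in TE weighs 10g (a density of 1.0 g/ml, which the protocol does not state). The function names are hypothetical.

```python
def weight_percent(solute_g: float, solvent_ml: float,
                   solvent_density_g_per_ml: float = 1.0) -> float:
    """Weight/weight percentage of solute in the final solution."""
    solvent_g = solvent_ml * solvent_density_g_per_ml
    return 100.0 * solute_g / (solute_g + solvent_g)


def solute_mass_for_weight_percent(target_percent: float, solvent_ml: float,
                                   solvent_density_g_per_ml: float = 1.0) -> float:
    """Grams of solute needed to reach target_percent w/w in the given solvent volume."""
    fraction = target_percent / 100.0
    solvent_g = solvent_ml * solvent_density_g_per_ml
    return fraction * solvent_g / (1.0 - fraction)


if __name__ == "__main__":
    print(round(weight_percent(13.26, 10.0), 1))                  # w/w % for 13.26 g in 10 ml
    print(round(solute_mass_for_weight_percent(57.0, 10.0), 2))   # g CsCl for 57% w/w
```

The second helper runs the calculation in reverse, which is convenient if the starting volume differs from 10ml.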

The refractive index of every fourth tube was determined on a refractometer. The 
viral DNA lies between refractive indices = 1.403-1.401. Density range for HSV DNA 
from Goldin A.L., et al., J. Virol. : 50-58. Buoyant density (ρ) = a η25° - b, where 
coefficients a and b are 10.8601 and 13.4974 respectively for CsCl, η25° = refractive index. 
(Isco tables, a handbook of data for biological and physical scientists, Isco, Inc. Lincoln, NE, 
ninth ed. 1987). 
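
The buoyant density relation above can be applied directly to refractometer readings. The following sketch uses the coefficients quoted from the Isco tables and the refractive index window given in the protocol. The function names and the example readings are hypothetical.

```python
# Coefficients for CsCl as quoted in the protocol (density = a * eta - b).
A_CSCL = 10.8601
B_CSCL = 13.4974

# Refractive index window in which the viral DNA is stated to lie.
RI_LOW, RI_HIGH = 1.401, 1.403


def buoyant_density(refractive_index: float) -> float:
    """Buoyant density (g/ml) of a CsCl fraction from its refractive index."""
    return A_CSCL * refractive_index - B_CSCL


def fractions_to_pool(readings: dict) -> list:
    """Fraction numbers whose refractive index lies within the stated window.

    readings maps fraction number -> measured refractive index.
    """
    return sorted(n for n, ri in readings.items() if RI_LOW <= ri <= RI_HIGH)


if __name__ == "__main__":
    # Hypothetical readings taken on every fourth fraction.
    readings = {4: 1.4055, 8: 1.4030, 12: 1.4020, 16: 1.4010, 20: 1.3995}
    for n in fractions_to_pool(readings):
        print(n, round(buoyant_density(readings[n]), 4))
```

With these coefficients the window 1.401-1.403 corresponds to densities of roughly 1.718 to 1.739 g/ml. Because only every fourth fraction is read, fractions lying between two in-window readings would presumably be pooled as well.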

The appropriate fractions were pooled and dialyzed against 3L of TE with frequent 
changing overnight. 

The final DNA prep was concentrated by precipitating with 1/10 volume 3M 
sodium acetate and 2.5 volumes of 100% ethanol. The DNA was resuspended in TE and 
the OD 260/280 reading taken. 
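
The OD 260/280 reading is typically used to estimate both yield and purity. The sketch below assumes the conventional conversion of 50 ug/ml of double-stranded DNA per A260 unit (1 cm path length) and treats a ratio near 1.8 as indicative of protein-free DNA. Neither value is stated in the protocol, and the function names and example readings are hypothetical.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # conventional factor for double-stranded DNA, 1 cm path


def dsdna_concentration(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimated dsDNA concentration (ug/ml) of the undiluted sample."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """A260/A280 ratio; values well below about 1.8 suggest protein contamination."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


if __name__ == "__main__":
    # Hypothetical readings on a 1:10 dilution of the final prep.
    print(dsdna_concentration(0.5, dilution_factor=10))
    print(round(purity_ratio(0.9, 0.5), 2))
```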

The DNA was then subjected to sequencing as provided in Sambrook, J. et al. 
(1989) Chapter 13, supra; or by automated DNA sequencing as per manufacturer's 
protocols, e.g., Applied Biosystems/Perkin Elmer, Foster City, CA. 

Certain preferred individual polynucleotide and polypeptide sequences of the 
invention are summarized in the following Tables. Tables 1, 2 and 3 represent three 
different sequencing efforts. Table 4 represents polypeptides encoded by ORFs from Table 
3. 

Table 1 provides polynucleotides of the invention and polypeptides encoded by 
ORFs, wherein the polynucleotide start and end position for each ORF is indicated by 
sequence numbers which correlate to the polynucleotide sequence referred to above each 
given polypeptide in the Table. Additionally, each ORF-encoded polypeptide sequence is 
labeled with the Contig number matching the Contig number of the polynucleotide 
sequence from which it was encoded. For ORF sequences wherein the start polynucleotide 
number is larger than the end polynucleotide number, translation of that polypeptide 
initiates on the nucleotide strand which is complementary to the strand depicted in the Table. 
In many cases there is more than one ORF mapped to an individual Contig. Contig 
assembly was performed using the publicly-available Phrap program, P. Green, University 
of Washington, WA, U.S.A. ORF prediction was accomplished using the publicly-available 
GenMark program, Georgia Tech Research Corp., Georgia Tech, GA, U.S.A. Homologies 
of the polypeptide sequences to known proteins are also indicated. These homologies were 






determined using the public database Mpsrch_pp, release 2.1 by J.Collins, Biocomputing 
Research Unit, University of Edinburgh (distributed by IntelliGenetics, Inc.). 
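
The coordinate convention described for Table 1 (a start number larger than the end number means the ORF is read from the complementary strand) can be sketched as follows. The helper names and the toy contig in the usage example are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
# Illustrative helpers; not part of the Tables.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orf_strand(start: int, end: int) -> str:
    """'complementary' when start > end, otherwise 'depicted'."""
    return "complementary" if start > end else "depicted"


def orf_length_nt(start: int, end: int) -> int:
    """Number of nucleotides spanned by the ORF (coordinates inclusive)."""
    return abs(start - end) + 1


def extract_orf(contig: str, start: int, end: int) -> str:
    """Coding-strand sequence of an ORF given 1-based, inclusive contig coordinates."""
    if start <= end:
        return contig[start - 1:end]
    return reverse_complement(contig[end - 1:start])


if __name__ == "__main__":
    # ORF 1 of Contig 100 is listed with start site 1808 and end site 3.
    n = orf_length_nt(1808, 3)
    print(orf_strand(1808, 3), n, n // 3)
    # Toy contig to show extraction from either strand.
    print(extract_orf("AAATGCCC", 4, 6), extract_orf("AAATGCCC", 6, 4))
```

For ORF 1 of Contig 100 this gives 1806 nucleotides, i.e. 602 codons read from the complementary strand.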

Table 2, obtained from a separately-performed sequencing, provides 
polynucleotides of the invention and polypeptides encoded by ORFs, wherein the 
polynucleotide start and end positions for ORFs are indicated by sequence numbers which 
correlate to the polynucleotide sequence referred to above each given polypeptide in the 
Table. Each ORF-encoded polypeptide sequence is labeled with a Contig number matching 
the Contig number of the polynucleotide sequence referred to above it, from which it was 
encoded. For ORF sequences wherein the nucleotide start number is larger than the end 
number, translation of that polypeptide initiates on the nucleotide strand which is 
complementary to the strand depicted in the Table. Contig assembly was accomplished 
using the publicly-available Sequencher 3.0 software program, Gene Codes Corp., Ann Arbor, 
MI, USA. ORF prediction was done using the publicly-available GenMark program 
(see Table 1). Homologies of the polypeptide sequences to known proteins are indicated. 
These homologies were determined using the publicly-available Mpsrch program (see Table 
1). 

Table 3, obtained from a separately-performed sequencing, provides 
polynucleotides of the invention and polypeptides encoded by ORFs, wherein the 
polynucleotide start and end positions for each ORF are indicated by sequence numbers 
which correlate to the polynucleotide sequence referred to above that polypeptide in the 
Table. Each ORF-encoded polypeptide sequence is labeled with a Contig number matching 
the Contig number of the polynucleotide sequence appearing above it from which it was 
encoded. For ORF sequences wherein the start polynucleotide number is larger than the 
end number, translation of that polypeptide initiates on the nucleotide strand which is 
complementary to the strand depicted in the Table. Contig assembly was performed using 
the publicly-available Phrap program (see Table 1). ORF prediction was accomplished 
using the publicly-available GenMark software program (see Table 1). Homologies of the 
polypeptide sequences to known proteins are indicated. These homologies were 
determined by comparison with public database Mpsrch_pp (see Table 1). 

Table 4 provides ORF sequences of polypeptides encoded by the polynucleotide 
sequences of Table 3 which were predicted by the GenMark program (see Table 1) as 
having more than a single start site (N-terminal methionyl residue). The Contig numbers 
and polynucleotide start and end sites for these ORFs correlate to the Contig numbers and 
polynucleotide sequence numbers of Table 3. 

TABLE 1 

[SEQ ID NO:1] = Contig ID 100 
[SEQ ID NO:2] 

ORF # = 1 from Contig ID 100 
ORF start site = 1808 
ORF end site = 3 
ORF sequence: 

MVLMGRLRNAPESLTYMFCAAIRVAPWTQSRTSLRVCTHVLFPDPALPVMRYAANGNSR 
SGRPVGTSKAATSRNHCRRGTCVTSSCCCESSRMRAMIGWTPCMDVKFKNASSLNRTAGL 
APGCCGGGPGARTSREPSPPDAAMAAQRARAPAMRTRGGDAALCAPEDGWVKVHPTPGTM 

LFREILLGQMGYTEGQGVYNWRSSEAATRQLQAAI FHALLNATTYRDLEEDWRRHWAR 
GLQPQRLVRRYRNAREGDIAGVAERVFDTWRCTLRTTLLDFAHGWNCFAPGGPSGPTSF 
PKYI DWLTCLGLVP I LRKTREGEATQRLGAFLRQHTLPRQLATVAG AAERAGPGLLELAV 
AFDSTRMAEYDRVHIYYNHRRGEWLVRDPVSGQRGECLVLCPPLWTGDRLVFDSPVQRLC 
PEIVACHALREHAHICRLRNTASVKVLLGRKSDSERGVAGAARVVNKALGEDDETKAGSA 

ASCLVRLIINMKGMRHVGDINDTVRAYLDEAGGHLIDTPAVDHTLPGFGKGGTGRGSAAQ 
DPGARPQQLRQAFQTAWNNINGMLEGYINNLFGTIERLRETNAGLATQLQARGGSSRST 
AX 

Gene matched: gi|136794|sp|P10190|UL06_HSV11 
Gene name: VIRION PROTEIN UL6. gi|73994 

[SEQ ID NO:3] 

ORF # = 2 from Contig 100 
ORF start site = 1378 
ORF end site = 4023 
ORF sequence: 

MAASGGEGSRDVRAPGPPPQQPGARPAVRFRDEAFLNFTSMHGVQPI IARIRELSQQQLD 
VTQVPRLQWFRDVAALEVPTGLPLREFPFAAYLITGNAGSGKSTCVQTLNE\^DC^ATrcA 
TRIAAQNMYVKLSGAFLSRPINTIFHEFGFRGNHVQAQLGQHPYTLASSPASLEDLQRRD 
LTYYWEVILDITKRALAAHGGEDARNEFHALTALEQTLGLGQGALTRLASVTHGALPAFT 
RSNIIVIDEAGLLGRHLLTTVVYCWWMINALYHTPQYAGRLRPVLVCVGSPTQTASLEST 

FEHQKLRC SWQS ENVLTYL ICNRTLREYTRLSHSWAI FINNKRCVEHEFGNLMKVLEYG 
LPITEEHMQFVDRFVVPESYITNPANLPGWTRLFSSHKEVSAYMAKLHAYLKVTREGEFV 
VFTLPVLTFVSVKEFDKYRRLTQQPTLTMEKWITANASRITNYSQSQDQDAGHVRCEVHS 
KQQLWARNDITYVLNSQVAVTARLRKMVFGFDGTFRTFEIAVLRDDSFVKTQGETSVEFA 
YRFLSRLMFGGLIHFYNFLQRPGLDATQRTLAYGRLGELTAELLSLRRDAAGASATRAAD 

TSDRS PGERAFNFKHLGPRDGGPDDFPDDDLDVIFAGLDEQQLDVFYCHYALEEPETTAA 
VHAQFGLLKRAFLGRYLILRELFGEVFESAPFSTYVDNVIFRGCELLTGSPRGGLMSVAL 
QTDNYTLMGYTYTRVFAFAEELRRRHATAGVAEFLEESPLPYIVLRDQHGFMSWNTNIS 
EFVESIDSTELAMAINADYGISSKIJ^TITRSQGLSLDKVAICFTPGNLRLNSAYVAMSR 
TTSSEFLHMNLNPLRERHERDDVISEHILSALRDPNWIVY* 

Gene matched: gi|74000|pir||WMBEU5 
Gene name: gene UL5 protein - human herpesvirus 1 

[SEQ ID NO:4] 

ORF # = 3 from Contig 100 
ORF start site = 4090 
ORF end site = 4695 
ORF sequence: 

MGNPQTTIAYSLHHPRASLTSALPDAAQWHVFESGTRAVLTRGRARQDRLPRGGWIQH 
TPIGLLVIIDCRAEFCAYRFIGRASTQRLERWWDAHMYAYPFDSWVSSSHGESVRSATAG 
ILTVVWTPDTIYITATIYGTAPEAARGCDNAPLDVRPTTPPAPVSPTAGEFPANTTDLLV 
EVLREIQISPTLDDADPTPGT* 



Gene matched: gi|136788|sp|P28280|UL04_HSV2H 
Gene name: PROTEIN UL4. gi|73890|pir||W 

[SEQ ID NO:5] 

ORF # = 4 from Contig 100 
ORF start site = 5413 
ORF end site = 4895 
ORF sequence: 

VGPLDG EPDRDAI S PLTS S VAGDPPG ADG P YVTFDTLFMVSS I DELGRRQLTDTI RKDLR 
LSLAKFSIACTKTSSFSGTAARQRKRGAPPQRTCVPRSNKSLQMFVLCKRANAAQVREQL 
RAVI RSRKPRKYYTRS SDGRLC PAVPVFVHEFVS SEPMRLHRDNVML STEPD* 

Gene matched: gi|330308 
Gene name: (L02638) nuclear phosphoprotein [Herpes simplex v 

[SEQ ID NO:6] 

ORF # = 5 from Contig 100 
ORF start site = 6656 
ORF end site = 5652 
ORF sequence: 

MKRARSRSPSPPSRPSSPFRTPPHGGSPRREVGAGILASDATSHVCIASHPGSGAGYPTR 
LAAGSAVQRRRPRGCPPGVMFSASTTPEQPLGLSGDATPPLPTSVPLDWAAFRRAFLIDD 
AWRPLLEPEIANPLTARLLAEYDRRCQTEEVLPPREDVFSWTRYCTPDDVRWIIGQDPY 
HHPGQAHGLAFSVRADVPVPPSLRNVIAAVKNCYPDAR^ 
TVKRGAAASHSKI/3WDRFVGGVVRRLAARRPGLVFMLWGAHAQNAIRPDPRQH 
PS PLSKVPFGTCQHFLAANRYLETRDIMPI DWS V* 

Gene matched: gi|330306 
Gene name: (M25410) uracil-DNA glycosylase [Herpes simplex v 

[SEQ ID NO:7] 

ORF # = 6 from Contig 100 
ORF start site = 7080 
ORF end site = 6529 
ORF sequence: 

VPCMRTPADDVSWRYEAPS VI DYARI DGI FLRYHC PGLDTFLWDRHAQRAYLVN PFLF AG 
GFLEDLSHSVFPADTQETTTRRALYKEIRDALGSRKQAVSHAPVRAGCVNFDYSRTRRCV 
GRRDLRPANTTSTWEPPVSSDDEASSQSKPLATQPPVLALSNAPPRRVSPTRGRRRHTRL 
RRN* 

Gene matched: gi|136776|sp|P28278|UL01_HSV2H 
Gene name: GLYCOPROTEIN L PRECURSOR. gi 

[SEQ ID NO:8] = Contig ID 101 

[SEQ ID NO:9] 
ORF # = 1 from Contig 101 
ORF start site = 351 
ORF end site = 1259 
ORF sequence: 

MIRRRGNVEIRVYYESVRPSRSRSHLKPSDHQEFPGHHVSPGSPGFPESPGNREFHDLPE 
NPGSRAYPGTRDPHDPHGCPGSLDPHGNPAQPAGLPSPVPYAPLGSPDPSSPRQRTYVLP 
RVGIRNAPASDTRAPKRAHSRHRADRPPESPGSELYPLNAQALAHLQMLPADHRAFFRTV 
I EVSRLCALNTHDPP PPLAGARVGQEAQLVHTQWLRANRES S PLWPWRTAAMNFI AAAAP 
CVQTHRHMHDLI^ACAFWCCLAHASTCSYAGLYSAHCQHLFRAFGCGPPVLTT 
CN* 

Gene matched: gi|757866 
Gene name: (X02138) 34K (Us10) (aa 1-284) [Human herpesvirus 

[SEQ ID NO:10] 
ORF # = 2 from Contig 101 
ORF start site = 2140 
ORF end site = 1871 
ORF sequence: 

MTSRPADQDSVRSSASVPLYPAASPVPAEAYYSESEDEAANDFLVRMGRQQSVLRRRRRR 
TRCVGLVI ACLWALLSGGFG ALLVWLLR * 

Gene matched: gi|135568|sp|P06481|TEGP_HSV11 
Gene name: TEGUMENT PHOSPHOPROTEIN US9 

[SEQ ID NO:11] 
ORF # = 3 from Contig 101 
ORF start site = 2377 
ORF end site = 2240 
ORF sequence: 

VALHAVDAPS QFVTWLAVRWLRG AVGLGAVLCG I AFYVT S I ARGA * 

Gene matched: gi|477669|pir||B45696 
Gene name: 23-29K immuno reactive epitope dispens 

[SEQ ID NO:12] 
ORF # = 4 from Contig 101 
ORF start site = 3572 
ORF end site = 2529 
ORF sequence: 

VAPPRHHRVIPEVSHTOGVTVimETPEAIl^APGETFETKVSIHAVAHDDGPYAMDWWM 
RFDVPSSCAEMRIYEACLYHPQLPECLSPADAPCAVSSWAYRLAVRSYAGCSRTTPPPRC 
FAEARMEPVPGLAWLASTVNLE FQHAS PQHAGLYLCWYVDDH I HAWGHMT I STAAQYRN 
AVVEQHLPQRQPEPVEPTRPHVTlAPPPAPSARGPLRLGAvXGAALLI^^ 
WRRRSWRAVKSRASATGPTYIRVADSELYADWSSDSEGERDGSLWQDPPERPDSPSTNGS 
GFEILSPTAPSVYPHSEGRKSRRPLTTFGSGSPGRRHSQASYSSVLW* 

Gene matched: gi|138240|sp|P04488|VGLE_HSV11 
Gene name: GLYCOPROTEIN E PRECURSOR, gi 

[SEQ ID NO:13] 

ORF # = 5 from Contig 101 
ORF start site = 4176 
ORF end site = 3460 
ORF sequence: 

MARGAGLVFFVGVWWSCLAAAPRTSWKRVTSGEDVVLLPAPAGPEERTRAHKLLWAAEP 
LDACG PLRPS WALWPPRRVTJETWDAACMRAPEPLAI AYS PPF PAGDEGL YS EL AWRDR 
VAWNESLVIYGALETDSGLYTLSWGLSDEARQVASWLWEPAPVPTPTPDDYDEEDD 
AGVSERTPVSVPPPTPPRWSPRGPPEAPSCYPRGVPRARGNGPYGDPGGHYVCPRGDV* 

Gene matched: gi|138241|sp|P13289|VGLE_HSV2 
Gene name: GLYCOPROTEIN E PRECURSOR, gi| 

[SEQ ID NO:14] 
ORF # = 6 from Contig 101 
ORF start site = 5796 
ORF end site = 4495 
ORF sequence: 

VYLWARVGGWLGYLGGTVTTPHKGSLEGGKLGQFIGRERGARTAVPTISHRAHSHLDPSDP 
GMPGRSLQGLAI LGLWVCATGLWRGPTVSLVSDS LVDAGAVGPQGFVEEDLRVFGELHF 
VGAQVPHTNYYDGIIELFHYPLGNHCPRWHWTLTACPRRPAVAFTLCRSTHHAHSPAY 
PTLELGIARQPLLRVRTATRDYAGLYVLRVWVGSATNASLFVLGVALSANGTFVYNGSDY 
GSCDPAQLPFSAPRLGPSSVYTPGASRPTPPRTTTSPSSPRDPTPAPGDTGTPAPASGER 
APPNSTRSASESRHRLTVAQVIQIAIPASIIAFVFLGSCICFIHRCQRRYRRPRGQIYNP 
GGVSCAVNEAAMARLGAELRSHPNTPPKPRRRSSSSTTMPSLTSIAEESEPGPWLLSVS 
PRPRSGPTAPQEV* 

Gene matched: gi|138328|sp|P06764|VGLI_HSV23 
Gene name: GLYCOPROTEIN I. gi|73722|pir 

[SEQ ID NO:15] 
ORF # = 7 from Contig 101 
ORF start site = 7017 
ORF end site = 5815 
ORF sequence: 

VCIAYHGMGRLTSGVGTAALLWAVGLRWCAKYAIADPSLKMADPNRFRGKNLPVLDQL 
TDPPGVKRVYHIQPSLEDPFQPPSIPITVYYAVLERACRSVLLHAPSEAPQIVRGASDEA 
RKHTYNLTI AWYRMGDNCAI PITVMEYTECPYNKSLGVCPIRTQPRWS YYDSFSAVSEDN 
LGFLMHAPAFETAGTYLRLVKINDWTEITQFI LEHRARASCKYALPLRI PPAACLTSKAY 
QQGVTVDS IGMLPRFI PENQRTVALYSLKIAGWHGPKPPYTSTLLPPELSDTTNATQPEL 
WEDPEDSALLEDPAGTVSSQIPPNWHIPSIQDVAPHHAPAAPSNPGLIIGALAGSTLAV 
LVIGGIAFWVRRRAQMAPKRLRLPHIRDDDAPPSHQPLFY* 

Gene matched: gi|419141|pir||E43674 
Gene name: US6 protein - human herpesvirus 2 <st 

[SEQ ID NO:16] 
ORF # = 8 from Contig 101 
ORF start site = 7553 
ORF end site = 7440 
ORF sequence: 

VGGLCLMI LGMACLLEVLRRLGRELARCC PHAGQFAP * 

Gene matched: gi|137132|sp|P13293|VGLJ_HSV2 
Gene name: GLYCOPROTEIN J. gi|419140|pir 

[SEQ ID NO:17] = Contig ID 102 

[SEQ ID NO:18] 
ORF # = 1 from Contig 102 
ORF start site = 1502 
ORF end site = 465 
ORF sequence: 

VCPPPPTNMAWCGSGLRLRPFHPPSPSFFVLRALIRAGPGPFAASPRAPSGPGCGMCRG 
DSPGVAGGSGEHCU^DDGDIX3RPRIAOTGAIARGFAHLWLQATTLGFVGSVVLSRGPYA 
DAMSGAFVIGSTGLGFLRAPPAFARPPTRVCAWLRLVGGGAAVALWSLGEAGAPPGVPGP 
ATQCLALGAAYAALLVLADDVHPLFLLAPRPLFVGTLGVWGGLTIGGSARYWWIDPRAA 
AALTAAWAGLGTTAAGDSFSKACPRHRRFCWSAVESPPPRYAPEDAERPTDHGPLLPS 
THHQRSPRVCGDGAARPENIWVPVVTFAGALALAACAARGWWERS * 

Gene matched: gi|136909|sp|P10227|UL43_HSV11 
Gene name: MEMBRANE PROTEIN UL43. gi|73 

[SEQ ID NO:19] 
ORF # = 2 from Contig 102 
ORF start site = 2996 
ORF end site = 1584 
ORF sequence: 

MAHLPGGAAAAPLSEDAIPSPRERTEDWPPCQIVLQGAELNGILQAFAPLRTSLLDSLLV 
VGDRGILVHNAIFGEQVFLPLDHSQFSRYRWGGPTAAFLSLVDQKRSLLSVFRANQYPDL 

RRVELTVTGQAPFRTLVQRIWTTASDGEAVELAS ETLMKRELTS FAVLLPQGDPDVQLRL 
TKPQLTKWNAVGDETAKPTTFELG PNGKFSVFNARTCVTFAAREEGAS S STS AQ VQ I LT 
SALKKAGQAAANAKTWGENTHRTFSVVVDDCSMRAVLRRLQVGGGTLNFFLTADVPSVC 
VTATGPNAVSAVFLLKPQRVCLNWLGRTPGSSTGSLASQDSRAGPTDSQDFSSEPDAGDR 
GAPEEEGLEGQARVPPAFPEPPGTKRRHAGAEWPADDATKRPKTGVPAAPTRAESPPLS 

ARYG PEAAEGGGDGGRYAWYFRDLQTGDAS PS PLS AFRG PQRP PYGFGLP * 

Gene matched: gi|136905|sp|P10226|VPAP_HSV11 
Gene name: POLYMERASE ACCESSORY PROTEIN 

[SEQ ID NO:20] 
ORF # = 3 from Contig 102 
ORF start site = 3490 
ORF end site = 4152 
ORF sequence: 

MGLFGMMKFAQTHHLVKRRGLRAPEGYFTPIAVDLWNVMYTLVVKYQRRYPSYDREAITL 
HCLCSMLRVFTQKSLFP I FVTDRGVECTEPVVFGAKAI LARTTAQCRTDEEASDVDAS P P 
PFPHHRLQAQFPPFQHAPPRARLRPGGPGERGPPAQARRPPGARPRSRPCAWLTCSVSAF 
CGRWGTPTSTRVSWRPTTPARTSIIPTRSRTCIPRIPISC* 

Gene matched: gi|549322|sp|P36699|VHS_HSV2G 
Gene name: VIRION HOST SHUTOFF PROTEIN. 

[SEQ ID NO:21] 
ORF # = 4 from Contig 102 
ORF start site = 4122 
ORF end site = 4970 
ORF sequence: 

VHTTDTDLLLMGCDIVLDISTGYIPTIHCRDLLQYFKMSYPQFLALFVRCHTDLHPNNTY 
ASVEDVLRECHWTAPSRSQARRGARRERANSRSLESMPTLTAAPVGLETRISWTEILAQQ 
I AGEDDYEEDPPLQPPDVAGGPRDGARS S S S EI LTPPELVQVPNAQRVAEHRGYVAGRRR 
HVIHDAPEALDWLPDPMTIAELVEHRYVKYVISLISPKERGPWTLLKRLPIYQDLRDEDL 
ARSIVTRHITAPDIADRFLAQLWAHAPPPAFYKDVLAKFWDE* 

Gene matched: gi|549322|sp|P36699|VHS_HSV2G 
Gene name: VIRION HOST SHUTOFF PROTEIN. 

[SEQ ID NO:22] 
ORF # = 5 from Contig 102 
ORF start site = 6266 
ORF end site = 5253 
ORF sequence: 

MDPAVSPASTDPLDTHASGAGAAPIPVCPTPERYFYTSQCPDINHLRSLSILNRWLETEL 
VFVGDEEDVSKLSEGELGFYRFLFAFLSAADDLVTENLGGLSGLFEQKDILHYYVEQECI 
EWHSRVYNI IQLVLFHNNDQARRAYVARTINHPAIRVKVDWLEARVRECDS I PEKFILM 
ILIEGVFFAASFAAIAYLRTNNLLRVTCQSNDLISRDEAVHTTASCYIYNNYLGGHAKPE 
AARVYRLFREAVDI EIGFIRSQAPTDS S ILSPGALAAIENYVRFS ADRLLGLI HMQPLYS 
APAPDASFPLSLMSTDKHTNFFECRSTS YAGAWNDL * 



Gene matched: gi|132624|sp|P03174|RIR2_HSV23 
Gene name: RIBONUCLEOSIDE-DIPHOSPHATE R 

[SEQ ID NO:23] 
ORF # = 6 from Contig 102 
ORF start site = 9861 
ORF end site = 6319 
ORF sequence: 

VIRRPVRPFGRTAHPASHGPAAVSVHRVRATVTLVPMANRPAASALAGARSPSERQEPRE 

PEVAPPGGDHVFCRKVSGVMVLSSDPPGPAAYRISDSSFVQCGSNCSMI I DGDVARGHLR 
DLEGATSTGAFVAISNVAAGGDGRTAWALGGTSGPSATTSVGTQTSGEFLHGNPRTPEP 
QGPQAVPPPPPPPFPWGHECCARRDARGGAEKDVGAAESWSDGPSSDSETEDSDSSDEDT 
GSGSETLSRSSSIWAAGATDDDDSDSDSRSDDSVQPDVWRRRWSDGPAPVAFPKPRRPG 
DSPGNPGLGAGTGPGSATDPRASADSDSAAHAAAPQAEVAPVLDSQPTVGTDPGYPVPLE 

LTPENAEAVARFLGDAVDREPALMLEYFCRCAREESKRVPPRTFGSAPRLTEDDFGLLNY 
ALAEMRRLC LDLPPVPPNAYTPYHLREYATRLVNGFKPLVRRS ARLYRI LG I LVHLRI RT 
REASFEEWMRSKEVDLDFGLTERLREHEAQLMILAQALNPYDCLIHSTPNTLVERGLQSA 
LKYEEFYLKRFGGHYMESVFQMYTRIAGFLACRATRGMRHIALGRQGSWWEMFKFFFHRL 
YDHQIVPSTPAMLNU5TRNYYTSSCYLVNPQATTNQATLRAITGm/SAIIJ\RNGGIGLCM 

QAFNDAS PGTAS IMPALKVLDSLVAAHNKQSTRPTGAC VYLEPWHS DVRAVLRMKGVLAG 
EEAQRCDNIFSALWMPDLFFKRLIRHLDGEENVTWSLFDRDTSMSLADFHGEEFEKLYEH 
LEAMGFGETI PIQDLAYAIVRSAATTGS PFIMFKDAVNRHYI YNTQGAAIAGSNLCTEIV 
HPSSKRSSGVCNLGSVNIJVRCVSRRTFDFGMLRDAVQACVIjMVNIMIDSTLQPTPQCARG 
HDNLRSMGIGMQGLHTACLKMGLDLESAEFRDLNTHIAEVMLLAAMKTSNALCVRGARPF 

SHFKRSMYRAGRFHWERFSNASPRYEGEWEMLRQSMMKHGLRNSQFIALMPTAASAQISD 
VSEGFAPLFTNLFSKVTRDGETLRPNTLLLKELERTFGGKRLLDAMDGLEAKQWSVAQAL 
PCLDPAHPLRRFKTAFDYDQELLIDLCADRAPYVDHSQSMTLYVTEKADGTLPASTLVRL 
LVHAYKRGLKTGMYYCKVRKATNSGVFAGDDNIVCTSCAL * 

Gene matched: gi|330199 
Gene name: (M12700) ribonucleotide reductase large subunit ( 

[SEQ ID NO:24] 
ORF # = 7 from Contig 102 
ORF start site = 11144 
ORF end site = 10323 
ORF sequence: 

VRRRLRCARRRRGGPGPHHDQLRRDAGRGAAGPVFRMPARHGPHARVSPRGHAVFRGASV 
WTQDELASVTAVCSGPQEATHTGHPGRPCSAVTIPACAFVDLDAELCLGGPGAAFLYLV 
FTYRQCRDQELCCVYWKSQLPPRGLEAALERLFGRLRITNTIHGAEDMTPLPPNRNVDF 
PLAVLAASS QS PRCS ASQVTNPQFVDRLYRWQ PDLRGRPTARTCT YAAFAELGVMPDNS P 
RCLHRTERFGAVGVPWILEGWWRPGGWRACA* 



Gene matched: gi|139176|sp|P22486|VP19_HSV2G 
Gene name: CAPSID ASSEMBLY AND DNA MATU 

[SEQ ID NO:25] 
ORF # = 8 from Contig 102 
ORF start site = 11722 
ORF end site = 10667 
ORF sequence: 

MKTKPLPTAPMAWAESAVETTTSPREIAGHAPLRRVLRPPIARRDGPVLIX3DRAPRRTAS 
TMWLLG IDPAES S PGTRATRDDTEQAVDKI LRGARRAGGLTVPGAPRYHLTRQVTLTDLC 
QPNAERAGALLLALRHPTDLPHLARHRAPPGRQTERLAEAWGQLLEASALGSGRAESGCA 
RAGLVS FNFLVAAC AAAYDARDAAEAVRAH I TTNYGGTRAG ARLDRFS ECLRAMVHTHVF 

PHEV>1RFFGGLVSWSHRTSWLASPPSAADPRRPHTPATRAGPVRPLPSRPAPLWTWTPSC 
AWGALGRRSCTWFSPTDSAGTRSSVACTWSRASSPRADWRRPSSGCSGASG* 



Gene matched: gi|139176|sp|P22486|VP19_HSV2G 
Gene name: CAPSID ASSEMBLY AND DNA MATU 

[SEQ ID NO:26] = Contig ID 103 

[SEQ ID NO:27] 

ORF # = 1 from Contig 103 
ORF start site = 3308 
ORF end site = 693 
ORF sequence: 

MAETMNVATCTHQTHHAARAPGATSAPGAASGDPLGARRPIGDDECEQYTSSVSLARMLY 
GGDLAEWVPRVHPKTTI ERQQHGPVTF PDASAPTARCVTVVRAPMGSGKTTALIRWLGEA 
IHSPDTSVLWSCRRSFTQTLATRFAESGLPDFVTYFSSTNYIMNDRPFHRLIVQVESLH 
RVGPNLLNNYDVLVLDEVMSTLGQLYS PTMQQLGRVDALMLRLLRTC PRI I AMDATANAQ 
LVDFLCSLRGEKNVHWIGEYAMPGFSARRCLFLPRLGPEVLQAALRPPGPAGGAPPPDA 

PPDATFFGEVEARLAGGDNVCIFLSTVSFAEWARFCRQFTDRVLLLHSLTPPGEVTTWG 
RYRWIYTTVVTVGLSFDPPHFDSMFAYVKPMNYGPDMVSVYQSLGRVRTLRKGELLIYM 
DGSGARSEPVFTPMLLNHWSASGQWPAQFSQVTNLLCRRFKGRCDASHADAAQARGSRI 
YSKFRYKHYFERCTLACLADSLNILHMLLTLNCMHVRFWGHDAALTPRNFCLFLRGIHFD 
ALRAQRDLRELRCQDPDTSLSAQAAETEEVGLFVEKYLRPDVAPAEWALMRGLNSLVGR 

TRF I YLVLLEACLRVPMAAHS SAI FRRLYDHYATGVI PTINAAGELELVALHPTLNVAPV 
WELFRLCSTl^CLQWDSI^GGSGRTFSPEDVLELLNPHYDRYMQLWELGHCI^DGPL 
LSEDAVKRVADALSGCPPRGSVSETEHALSLFKIIWGELFGVQLAKSTQTFPGAGRVKNL 
TKRAIVELLDAHRIDHSACRTHRQLYALLMAHKREFAGARFKLRAPAWGRCLRTHASGAQ 
PNTDI ILEAALSELPTEAWPMMQGAVNFSTL* 

Gene matched: gi|136806|sp|P10193|UL09_HSV11 
Gene name: ORIGIN OF REPLICATION BINDIN 

[SEQ ID NO:28] 

ORF # = 2 from Contig 103 
ORF start site = 3160 
ORF end site = 4590 
ORF sequence: 

VYCSHSSSPMGRRAPRGSPEAAPGADVAPGARAAWWV7WCVQVATFIVSAICWGLLVLAS 
VFRDRFPCLYAPATSYAEANATVEWGGVAVPLRLDTQSLLATYAITSTLLLAAAVYAAV 
GAVTSRYERALDAARRLAAARMAMPHATL IAGNVCAWLLQ ITVLLLAHRI SQLAHLI YVL 
HFACLVYLAAHFCTRGVLSGTYLRQVHGLIDPAPTHHRIVGPVRAVMTNALLLGTLLCTA 
AAAVSLNTIAAI^FNFSAPSMLICLTTLFALLWSLLLWEGVLCHYVRVLVGPHLGAIA 

ATGIVGLACEHYHTGGYYVVEQQWPGAQTGVRVALALV^ 

HTKFFVRMRDTRHRAHSALRRVRS SMRGSRRGGPPGDPGYAET PYAS VSHHAEI DRYGDS 
DGDPIYDEVAPDHEAELYARVQRPGPVPDAEPIYDTVEGYAPRSAGEPVYSTVRRW* 

Gene matched: gi|136810|sp|P04288|VIMP_HSV11 
Gene name: PROBABLE INTEGRAL MEMBRANE P 

[SEQ ID NO:29] 
ORF # = 3 from Contig 103 
ORF start site = 6853 
ORF end site = 4784 
ORF sequence: 

MAAAATPGAKRPADPARDPDSPPKRPRPNSLDLATVFGPRPAPPRPTSPGAPGSHWPQSP 
PRGQPDGGAPGEKARPASPALSEASSGPPTPDIPLSPGGAHAIDPDCSPGPPDPDPMWSA 
SAI PNALPPH ILAETFERHLRGLLRGVRS PLAIGPLWARLDYLC SLWSLEAAGMVDRGL 
GRHLV^LTRRAPPSAAEAVAPRPLMGFYEAATQNQADCQLWALLRRGLTTASTLRWGAQG 
PCFSSQWLTHNASLRLDAQSSAVMFGRVNEPTARNLLFRYCVGRADAGVNDDADAGRFVF 
HQPGDLAEENVHACGVLMDGHTGMVGASLDI LVCPRDPHGYLAPAPQT PLAFYEVKCRAK 
YAFDPADPGAPAASAYEDLMARRSPEAFRAFIRS I PNPGVRYFAPGRVPGPEEALVTQDR 
DWLDSRAAGEKRRCSAPDRALVELNSGWSEVLLFGVPDLERRTISPVAWSSGELVRREP 
I FANPRHPNFKQ I LVQGNVPRQPLSRLPPATAPGDVPRQAPRGRGGGRDVPPGGRPRSAR 
RAWRGPRTRQGIDPPGPGRSDRPDHHPRPRRAGDIPGHPAKQPPGLRRYARQVMGLAFSG 
ARPCCCRHNVIITDGGEWSLTAHEFDWDIESEEEGNFYVPPDMRWTRAPGPQYRRAS 
DPPSRHTRRRDPDVARPPATLTPPLSDSE*
Gene matched: gi | 119694 | sp| P06489 |EXON_HSV2
Gene name: ALKALINE EXONUCLEASE. gi|3302 


[SEQ ID NO: 30] 
ORF # = 4 from Contig 103 
ORF start site = 5313 
ORF end site = 4990
ORF sequence: 

VTFLGRHRAGAEEGVTFRLEDGRGAPAGRGGAPGPAKAS ILPDQAVPI ALI ITPVRVE PG 
IYRDIRRNSRLAFDDTLAKLWASRSPGRGPAAADTTSSSPTAGRSSR* 


Gene matched: gi | 330252 

Gene name: (M11854) 1 . 9kD ORF [Herpes simplex virus type 2] 



[SEQ ID NO: 31]

ORF # = 5 from Contig 103 
ORF start site = 8477 
ORF end site = 6894 
ORF sequence: 

VGGRRPGGRMDESGRQRPASHVAADI S PQGAHRRS FKAWLAS YI HSLSRRASGRPSG P S P
RDGAVSGARPGSRRRSSFRERLRAGLSRWRVSRSSRRRSSPEAPGPAAKLRRPPLRRSET 
AMTS P PS P PSH ILSLARI HKLCI PVFAVNPALRYTTLEI PGARSFGGSGGYGEVQLI C EH 
KLAVKTIREKEWFAVELVATLLVGECAFCGGRTHDIRGFITPLGFSLQQRQIVFPAYDMD 
IX3KYIGQLASLRATTPSVATALHHCFTDIJtflAvVFLOT 

VSLRRAVLADFSLVTLNSNSTISRGQFCLQEPDLESPRGFGMPAALTTANFHTLVGHGYN
QPPELLVTCYLNNERAEFNNRPLKHDVGIAVT)LYAI^ 

GYQYFNNQLSPDFAVALLAYRCVLHPALFVTJSAETNTHGLAYDVPEGIRRHLRNPKIRRA 
FTEQCINYQRTHKAVLSSVSLPPELRPLLVLVSRLCHANPAARHSLS* 


Gene matched: gi | 125628 | sp| P04290 |KR2_HSV11 
Gene name: PROBABLE SERINE/THREONINE-PRO

[SEQ ID NO: 32]

ORF # = 6 from Contig 103 
ORF start site = 8113 
ORF end site = 8352 
ORF sequence: 

MAVSDLRRGGRLSLAAGPGASGDERRRDERLTRHRDSPARSRSRKLDRRRDPGRAPETAP
SRGEG PLGRPDARRLRECM *

Gene matched: gi | 93505 |pir| | B34768 
Gene name: ORF5 protein - Orf virus (strain NZ2)



[SEQ ID NO: 33] 
ORF # = 7 from Contig 103 
ORF start site = 8863
ORF end site = 8204 
ORF sequence: 

MSRDASHAALRRRLAETHLRAEVYRDQTLQLHREGVSTQDPRFVGAFMAAKAAHLEL^ 
LKSRARLEMMRQRATCVK I RVEEQAARRDFLTAHRRYLDPALS ERLDAADDRLADQEEQL 
EEAAANASLWGDGDLMX^SPGDSDLLVm/QLTSAPKVHTDAPSRPGSRPTYTPSAAGR
PDAQAAPPPETAPSPEPAPGPAADPASGSGFARDCPDGE* 

Gene matched: gi | 136823 | sp| P04291 |UL14_HSV11 
Gene name: HYPOTHETICAL UL14 PROTEIN, g

[SEQ ID NO: 34]
ORF # = 8 from Contig 103 
ORF start site = 8749 
ORF end site = 10242
ORF sequence: 

VYSRPPGVAAGSGPCTPRPGGASRPNVGAGPRGWRLGSSRRPRARPTSDSFAPTPLTSAA 
PASPAMFGQQLASDVQQYLERLEKQRQQKVGVDEASAGLTLGGDALRVPFLDFATATPKR 
HQTVVPGVGTLHDCCEHSPLFSAVARRLLFNSLVPAQLRGRDFGGDHTAKLEFLAPELVR 

AVARLRFRECAPEDAVPQRNAYYS VXNTFQALHRSEAFRQLVHFVRDFAQLLKTS FRAS S
LAENTGPPKKRAKTOVATHGQTYGTLELFQKMILMHATYFLAAVLLGDHA^ 
FEIPLFSDTAVRHFRQRATVFLVPRRHGKTWFLVPLIALSLASFRGIKIGYTAHIRKATE 
PVFDEIDACLRGWFGSSRVDHVXGETISFSFPDGSRSTIVFASSHNTNVSTPSSRGACFP 
GAALPEIDRQTNTARRECGTTRPQPPPPWRGE^LFICNRTMRLWPRPARPRGSSLQTGG 

WYTMTERRGATRRWSGG *



Gene matched: gi | 74013 |pir | |WMBE31 

Gene name: 38K protein - human herpesvirus 1 gi|5 


[SEQ ID NO: 35] 
ORF # = 9 from Contig 103 
ORF start site = 11332 
ORF end site = 10115
ORF sequence:

VFLFHRSPTPPPKSYTRWPLCFWCVSGPFPTTNMAQRAVWRPQGTPGPPGAAAPPGHRGA 
PPDARAPDPGPEADLVARIANSVFVWRVVRGDERLKIFRCLTVLTEPLCQVALPDPDPER 
ALFCEI FLYLTRPKALRLPSNTFFAI FFFNRERRYCATVHLRSVTHPRTPLLCTLAFGHL 
EAASPPEETPDPAAEQLADEPVAHELDGAYLVPTEPPPNPGACCALGPGAWWHLPGGRIY
CWAMDDDLGSLC PPGSRARHLGWLLSRI TDPPGGGGACAPTAHI DSANALWRAPAVAEAC 
PCVAPCMWSNMAQRTLAWGDASLCQLLFGHPVDAVILRQATRRPRITAHLHEVVVGRDG 
AESVI RPTSAGWRLCVLS SYTSRLFATSC PAVARAVARASSSDYK * 


Gene matched: gi | 136829 | sp| P10200 |UL16_HSV11 
Gene name: PROTEIN UL16. gi |73879 |pir | | 

[SEQ ID NO:36]

ORF # = 10 from Contig 103 
ORF start site = 12706 
ORF end site = 11336 
ORF sequence: 

FLTG YFRVHG I DKLDQRAVQDVTRRH P VRAR PKHAASGVXS GLRQGALVHXAVSGGALG A
SDAEAVLAGLEPPGGGRFATPGGPRAAGDDVLNDVLTLVPGTAKPRSLVEV^DRGWEPLA 
GGDRPDWLWSRRSISVVLRHHYGTKQRFVWSYKNSVAWGGRRTRPPLLSSYLATALTEA 
CAAERWRPHQLSPAAQTALLRRFPALEGPLRHPRPVLQPFDIAAEVAFVARIQIACLRA 
LGHSIRAALQGGPRIFQRLRYDFGPHQSEMX3EVTRRFPVLLENLMRALEGTAPDAFFHT 

AYALAVLAHLGGQGGRGRRRRLVPLSDDIPARFADSDAHYAFDYYSTSGDTLRLTNRPIA
WIDGDVNGREQSKCRFMEGSPSTAPHRVCEQYLPGESYAYLCLGFNRRLCGLVVFPGGF 
AFT INTAAYLSLADPVARAVGLRFCRGAGTG PGLVR* 



Gene matched: gi | 136835 | sp| P10201 |UL17_HSV11
Gene name: PROTEIN UL17. gi | 73875 |pir| | 



[SEQ ID NO: 37] = Contig ID 104

[SEQ ID NO: 38] 
ORF # = 1 from Contig 104 
ORF start site = 3027 
ORF end site = 262
ORF sequence: 

VSGRAGDPAGLPAPRGGPTWPMPSGGPPPEVKAGLRADMWGVMGQYREAXEHQTPDTETV 
VAGMH PALVWLKTMFXDAPETPVLVQFFSDHA PT I AKAVSNAINAGSAAVATAS PAATV 
DAAVRAHGALADAVSAIX3AAARDPAS PL SFLAALADSAAG YVKATRLA^ DKLTT 
LGSAAADLVFHARRACAQPEGDHAALIDAAARATTAARESLAGHEAGFGGLLHAEGTAGD
HSPSGRALQELGKVIGATRRRAEELEAAVADLTGKMAAQRARGSSERWAAGVEAALDRVE
NRAEFDVTOLRRLQALAGTHGYNPRDFRKRAEQALAANAEAVT^^ 
RHPMLPPLAAIHRLGWSAAFHAAAETYADMFRVDAEPIiARLLRIAEGLLEMAQAGDGFID 
YHEAVGRLADDOTSVPGLRRWPFFQHGYADYVELRDRLDAIRADVHRALGGVPLDLAAA 
AEQI SAARNDPEATAELVRTGVTLPC PS EDALVACAAALERVDQS PVKNTAYAEYVAFVT
RQDTAETKDAWRAKQQRAEATERWAGLREALAARERRAQIEAEGLANL 
ATOAKTLDQARSVAEIADQVEVLLDQTEKTRELDVPAVIWLEHAQRTFETHPLSAARGDG 
PGPLARHAGRLGALFDTRRRVDALRRSLEEAEAEWDEVWGRFGRVRGGAWKSPEGFRAMH 
EQLRALQDTTNTVSGLRAQPAYERLSARYQGVLGAKGAERAEAVEELGARVTKHTALCAR 
LRDEVVRRVPWEMNFDALGRLKAEFDAAAADIAPWAVEEFRGARELIQYRMGLYSAYARA
GGKALFLFFFFPPPLSSFLPHFHFFIHHHHSFTKFFTSSSLHSYHLFPSSIYSIPSISPL 
YPHSSLSFPSSQFLHIFLSLP* 

Gene matched: gi | 135576 | sp |P10220 |TEGU_HSV11
Gene name: LARGE TEGUMENT PROTEIN (VIRI 

[SEQ ID NO:39] 
ORF # = 2 from Contig 104 
ORF start site = 3914
ORF end site = 2901 
ORF sequence: 

VMPVAPPPRGAGGRAPC PPALGPEAIHARLEDVRI QARRAI ESAIKEYFHRGAVYS AKAL 
QASDSHDCRFHVASAAWPMVQLLESLPAFDQHTRDVAQRAALPPPP PLATS PQAILLRD 
LLQRGQTLDAPEDLAAWLSVLTDAATQGLIERKPLEELARSIHGINDQQARRSSGLAELQ
RFDALDAAIjAQQLDSDAAFVPATGPAPYVDGGGLSPEATRMAEDALRQARAMEAAKMTAE 
LAPEARSRLRERAHALEAMLNDARERAKVAHDAREKFLHKLQGVLRPLPDFVGLKACPAV 
LATLRASLPRGVDRPGRCRPGAPPRKSRRGCGRTCGG* 


Gene matched: gi| 221757 

Gene name: (D10879) virion protein [Herpes simplex virus typ 

[SEQ ID NO:40]

ORF # = 3 from Contig 104 
ORF start site = 6099 
ORF end site = 3643 
ORF sequence: 

VVTGVRNQFATDLEPGGSVSCMRSSLSFLSLLFDVGPRDVLSAEAIEGCLVEGGEWTRAA
AGSGPPRMCSIIELPNFLEYPAARGGLRCVFSRVYGEVGFFGEPTAGLLETQCPAHTFFA 
GPWAMRPLSYTLLTIGPIX3MGLYRIX3DTAYLFDPHGLPAGTPAFIAKV^AGDVYPYLTYY 
AHDRPKVRWAGAMVFFVPSGPGAVAPADLTAAALHLYGAS ETYLQDEPFVERRVAI TH PL 
RGEIGGLGALFVGWPRGDGEGSGPWPALPAPTHVQTPRADRPPEAPRGASGPPNTPQA 

GHPNRPPDDVWAAALEGTPPAKPSAPDAAASGPPHAAPPPQTPAGDAAEEAEDLRVLEVG
AVPVGCHRARYSTGLPKRRRPTWTPPSSVEDLTSGERPAPKAPPAKAKKKSAPKKKAPVA
AEVPASSPTPIAATVPPAPDTPPQSGQGGGDDGPASPSSPSVLETLGARRPPEPPGADLA 
QLFEVH PNVAATAVRLAARDAALAREVAACSQLTI NALRS PY PAHPGLLELC VI FFFERV 
LAFLIENGARTHTQAGVAGPAAALLDFTLRMPPRKTAVGDFLASTRMSLADVAAHRPLIQ 
HVLDKNSQIGRIALAKLVLVARDFIRETTDAFYGDLADLDLQLRAAPPANLYAI^

RSRAHPNTLFAPATPTHPEPLLHRIQALAHFARGKKMRVEAEAREMREALYAIiARGVYSV 
SQRAGPPDRDARCPPPPGRRRQGPVPARPGPRGHPCAAGGRADPGPPGDRKRDQGVLPPG 
SRIQREGPAGQRQ PRLSVSRGLGRGRAHGPVAG I ATGL* 


Gene matched: gi | 221757 

Gene name: (D10879) virion protein [Herpes simplex virus typ 

[SEQ ID NO: 41] 
ORF # = 4 from Contig 104
ORF start site = 6751 
ORF end site = 6269 
ORF sequence: 

MNAHFANEVQYDLTRDPSSPASLIHVIISSECLAAAGVPLSALVRGRPDGGAAANFRVET 
QTRPHAPGDCTPWRSAFAAYVPADAVGAILAPVIPAHPDLLPRVPSAGGLFVSLPVACDA
QGVYDPYTVAALRLAWGPWATCARVLLFSYDELTRYRVCG* 

Gene matched: gi | 136835 | sp| P10201 |UL17_HSV11 
Gene name: PROTEIN UL17. gi | 73875 |pir| |



[SEQ ID NO: 42] 
ORF # = 5 from Contig 104
ORF start site = 6781 
ORF end site = 8052 
ORF sequence: 

VPEGAWVGGACARPRGPRAHVRLYAVCFVCPQGIRGQDFNLLFVDEANFIRPDAVQTIMG 
FLNQANCKI I FVS STNTGKASTSFLYtHJlGAADELLNWTYICDDHMPRVvTHTNATACS
CYII^KPVFITMDGAVRRTADLFLPDSFMQEIIGGQARETGDDRPVLTKSAGERFLLYRP 
STTTNSGLMAPELYVYVDPAFTANTRASGTG I AWGRYRDDF 1 1 FALEHFFLRALTGSAP 
ADIARCVVHSI^QVIALHPGAFRSTOVAV^GNSSQDSAVAIATHVHTEMHRILASAGANG 
PGPELLFYHCEPPGGAVIjYPFFLLNKQKTPAFEYFIKKFNSGGvTIASQEL^ 
PV^YLSEQLNNLI ETVSPNTDVRMYSGKRNGAADDLMVAVIMAI YLAAPTGI PPAFFPIT
RTS* 

Gene matched: gi | 139646 | sp | P04295 |VTER_HSV11
Gene name: PROBABLE DNA PACKAGING PROTE

[SEQ ID NO:43]
ORF # = 6 from Contig 104 
ORF start site = 9483
ORF end site = 8392 
ORF sequence: 

VLLSPAPPPLPHGRCPPSLFHHRPGCVALSGPPAPPRSGVSRPGAMITDCFEADIAIPSG 
ISRPDAAALQRCEGRVVFLPTIRRQLAI^VAHESFVSGGVSPDTLGLLLAYRRRFPAVI 
TRVLPTRIVAC PVDLGLTHAGTVNLRNTS PVDLCNGDPVSLVP PVFEGQATDVRLESLDL
TI^PVPLPTPLAREIVARLVARGIRDLNPDPRTPGELPDLNVLYYNGARLSLVADVQQL 
ASVNTELRSLVLNMVYSITEGTTLILTLIPRLLALSAQIX5YVNALLQMQSVTREAAQLIH 
PEAPMLMQIX3ERRLPLYEALVAWLAHAGQLGDILALAPAVRVCTFDGAAVVQSGDMAPVI 
RYP* 


Gene matched: gi | 139191 | sp | P10202 |VP23_HSV11 
Gene name: CAPSID PROTEIN VP23 . gi|7387 


[SEQ ID NO: 44] 
ORF # = 7 from Contig 104 
ORF start site = 13917 
ORF end site = 9727 

ORF sequence:

WEGIX^LPE^LMEPANPPRNPMAAPARDPPGYRYAAAMVPTGSILSTIEVASHRRLFDF 
FARVRSDENSLYDVEFDALLGSYCNTLSLVRFLELGLSVACVCTKFPELAYMNEGRVQFE 
VHQPLIARDGPHPVEQPVHNYMTKVIDRRALNAAFSLATEAIALLTGEALDGTGISLHRQ 
LRAIQQLARNVQAVLGAFERGTADQMLHVLLE 

LVAELKRSFCDTSFFLGKAGHRREAI EAWLVDLTTATQPSVAVPRLTHADTRGRPVDGVL
VTTAAIKQRLLQSFLKVEDTEADVPVTYGEMVLNGANLVTALVMGKAVRSLDDVGRHLLE 
MQEEQLEANRETLDELESAPQTTRVRADLVAIGDRLVFLEALEKRIYAATNVPYPLVGAM 
DLTFVLPLGLFN PAMERFAAHAGDLVPAPGHPE PRAFPPRQLFFWGKDHQVLRLSMENAV 
GTVCH PSLMN I DAAVGGVNHDP VEAANPYGAYVAAPAG PG ADMQQRFLNAWRQRLAHGRV 

RWVAECQMTAEQFMQPDNANLALELH PAFDFFAGVADVELPGGEVPPAGPGAI QATWRW
NGNLPLALC PVAFRDARGLELGVGRHAMAPATI AAVRGAFEDRS YPAVFYLLQAAI HGSE 
HVFCALARLVTQCITSYWNNTRCAAFVNDYSLVSYI^ 

AIAQLVDDFTLPGPELGGQAQAEI^I^DPALLPPLVWDCDGLMRHAALDRHRDCRIDA 
GGHEPVYAAACWATADFNRNDGRLLHNTQARAADAADDRPHRPADWTVHHKIYYYVLVP 
AFSRGRCCTAGVRFDRVYATLQNMVVPEIAPGEECPSDPWDPAHPLHPANLVANTVNAM
FHNGRVWDGPAMLTLQVIxAHNMAERT^ 

LMA PQHLDHTI QNGEYFYVLPVHALFAGADHVANAPNF PPALRDLARHVPLVP PALGANY 
FS S I RQPWQHARESAAGENALTYALMAGYFKMS PVALYHQLKTGLH PGFGFTWRQDRF 
VTENVLFSERASEAYFLGQLQVAR11ETGGGVSFTLTQPRGNVDLGVGYTAVAATATVRNP 
VTDMGNLPQNFYLGRGAPPLLNNAAAVYLRNAWAGNRLGPAQPLPVFGCAQVPRRAGMD
HGQDAVCEFIATPVATDINYFRRPCNPRGRAAGGVYAGDKEGDVIALMYDHGQSDPARPF
AATANPWASQRFS YGDLLYNGAYHLNGAS PVLS PCFKFFTAADITAKHRCLERL IVETGS 
AVSTATAASDVQFKRPPGCRELVEDPCGLFQEAYPITCAS DPALLRSARDGEAHARETH F 
TQYLI YDAS PLKGLSL * 


Gene matched: gi | 137571 | sp| P06491 |VCAP_HSV11
Gene name: MAJOR CAPSID PROTEIN (MCP) (


[SEQ ID NO:45]
ORF # = 8 from Contig 104
ORF start site = 14832
ORF end site = 14164
ORF sequence:

MTMRDDVPLLDRELVYEAACGGEDGEIiPLDEQFSLSSYGTSDFFVSSAYSRLPPHTQPVF 
SKRWMFAWSFLVLKPLELVAAGMYYGWTGRAVAPACI I AAVLAYYVTWLARALLL YVNI 
KRDRLPLSPPVFWGLCVIMGGAALCALVAAAHETFSPIX3LFHWITASQLLPRTDPLRARS 
LG I AC AAGAAMWVAAADCFAAFTNFFLARFWTRAI LKAPVAF * 


Gene matched: gi | 136841 | sp| P10204 | UL20_HSV11
Gene name: MEMBRANE PROTEIN UL20. gi|73 


[SEQ ID NO: 46] 
ORF # = 9 from Contig 104 
ORF start site = 15168
ORF end site = 17081 

ORF sequence:

VGRQGERWVGGGNEKNTQRATSGMRPELSLKGRPCVTEAWCPSTDAAIHSGGSSSVRPQ 
PYARAARARATHGSRSRHRQPLLPPPSSHHPTIPPPPSPPRGSPAMELTYATTLHHRDW 
FYVTADRNRAYFVCGGSVYSVGRPRDSQPGEIAKFGLVVRGTGPKDRMVANYVRSELRQR 
GLREVRPVGEDEVFLDSVC LLNPNVS S ERDVINTNDVEVLDECLAEYCTSLQT S PGVLVT 

GVRVRARDRVT ELFEHPAI VNI SSRFAYTPSPYVFALAQAHLPRLPSSLE PLVSGLFDGI
PAPRQ PLDARDRRTDWI TGTRAPRPMAGTGAGGAGAKRATVSEFVQVKH I DRWS PS VS 
SAPPPSAPDASLPPPGLQEAAPPGPPLRELWWVFYAGDRALEEPHAESGLTREEVRAVHG 
FREQAWKLFGSVGAPRAFLGAAIiAL S PTQKLAVYYYLIHRERRMS PFPALVRLVGRY I QR 
HGLWPAPDEPTI^AMNGLFRDALAAGTVAEQLLMFDLLPPKDVPVGSDARADSAALLR 

FVDSQRLTPGGSVSPEHVMYLGAFLGVLYAGHGRIAAATHT?;^
SAFDRGPAGAAGRTRTAGYLDALLTVCLARAQHGQSV* 

Gene matched: gi | 136845 | sp|P10205 |UL21_HSV11
Gene name: PROTEIN UL21. gi | 73866 |pir | |

[SEQ ID NO: 47]
ORF # = 10 from Contig 104 
ORF start site = 19116 
ORF end site = 17302
ORF sequence: 

VYLS PSALKWPVGVWTTGGLAFGCDAALVRARYGKGFMG WI SMRDS P PAEI I WPADKT 
LARVGNPTDENAPAVLPGPPAGPRYRVFVLGAPTPADNGSALDALRRVAGYPEESTNYAQ 
YMSRAYAEFLGEDPGSGTDARPSLFWRLAGLLASSGFAFVNAAHAHDAIRLSDLLGFLAH 

SRVI^GLAARGAAGCAADSWLWSVLDPAARLRL

QLAFVLDS PAAYGAVAPS AARLI DAL YAEFLGGRAIiTAPMVRRALFYATAVLRAPFLAGA 
PSAEQRERARRGLLITTALCTSDVAAATHADLRAALARTDHQKNLFWLPDHFSPCAASLR 
FDLAEGGFI LDALAMATRSDI PADVKAQQTRGVASVLTRWAHYNALIRAFVPEATHQCSG 
PSHNAEPRILVPITHNASYWTHTPLPRGIGYKLTGVDVRRPLFITYLTATCEGHAREIE 

PKRLTOTENRRDLGLVGAVFLRYTPAGEVMSVLLVDTDATQQQLAQGPVAGTPNVFSSDV
PSVALLLFPNGTOIHLIAFDTLPIATIAPGFIJ^SAU^AmiTAAIAGILRVVRTCVPFL 
WRRE* 

Gene matched: gi | 138316 | sp | P08356 |VGLH_HSV1E
Gene name: GLYCOPROTEIN H PRECURSOR, gi 



[SEQ ID NO:48] 
ORF # = 11 from Contig 104
ORF start site = 20070 
ORF end site = 19117 
ORF sequence: 

VSISAGVRGQGWHRISTPPKNGAGRSVLVFGLVLPLCFYPHPTPSFGPRLRQQRASDSLR 
GAEPLWAVGTDTPPSADWQPGRTTMGPGLWWMGVLVGVAGGHDTYWTEQIDPWFLHGLG
LARTYWRDTNTGRLWLPNTPDASDPQRGRLAPPGELNLT 

AEFPRDPGQLLYIPKTYLLGRPRNASLPELPEAGPTSRPPAEVTQLKGLSHNPGASALLR 
SRAVATTFAAAPDREGLTFPRGDDGATERHPDGRRNAPPPGPPAGTPRHPTTNLSIAHLHN 
ASVTWLAARGLLRTPGR* 


Gene matched: gi | 364588 | prf | | 1508243Y 
Gene name: UL22 gene [Human herpesvirus 1]


[SEQ ID NO:49] 
ORF # = 12 from Contig 104 
ORF start site = 21285 
ORF end site = 20155 
ORF sequence:

MASHAGQQHAPAFGQAARASGPTDGRAASRPSHRQGASEARGDPELPTLLRVTIDGPHGV
GKTTTSAQLMEALGPRDNIVYVPEPMTYWQVLGASETLTNIYNTQHRLDRGEISAGEAAV 
VMTSAQITMSTPYAATDAVLAPHIGGEAVGPQAPPPALTLVFDRHPIASLLCYPAARYLM 
GSMTPQAVIiAFVALMPPTAPGTNLVLGVLPEAEHADRLARRQRPGERLDIiAMLSAIRRVY 
DLLANTVRYLQRGGRWREDWGRLTGVAAATPRPDPEDGAGSL PRI EDTLFAL FRVPELLA
PNGDLYHI FAWVLDVLADRLLPMHLFVLDYDQS PVGCRDALLRLTAGMI PTRVTTAGS I A 
EIRDLARTFAREVGGV* 

Gene matched: gi | 59823

Gene name: (V00466) thymidine kinase [Herpes simplex virus ty



[SEQ ID NO: 50] 

ORF # = 13 from Contig 104 

ORF start site = 20968 

ORF end site = 22032 

ORF sequence: 

VXRVTOVTIQGIXXSPQHLPVSHRLGDVTO 

QLRIPAGFGGPIAMARTGRRAAVGRPARTSSLTERRRVLIJVGvTlSHTRFYKAFAREVREF 
NATRICGTLLTLMSGSLQGRSLFEATRVTLICEVDLGPRRPDCICVFEFANDKTLGGVCV 
ILELKTCKSISSGDTASKREQRTTGMKQLRHSLKLLQSLAPPGDKVVYLCPILVTVAQRT 
LRVSRVTRLVPQKISGNITAAVRMLQSLSTYAVPPEPQTRRSRRRVAATARPQRPPSPTR 
DPEGTAGHPAPPESDPPSPGWGVAAEGGGVLQKIAALFCVPVAAKSRPRTKTE* 



Gene matched: gi | 136854 |sp| P10208 |UL24_HSV11 
Gene name: PROTEIN UL24 . gi | 74056 |pir| | 



[SEQ ID NO: 51] 
ORF # = 14 from Contig 104 
ORF start site = 22313 
ORF end site = 23893 
ORF sequence: 

MDPYYPFDALDWEHRRFIVADSRSFITPEFPRDFWMLPVFNIPRETAAERAAVMQAQRT 
AAAAAL EN AALQAAEL PVD I ERR I R P I EQQVHH I ADALEAL ETAAAAAEEADAARDAEAR 
GEGAADGAAPSPTAGPAAAEMCTQIVRNDPPLRYDTNLPVDLLH1WYAGRGAAGSSGWF 
GTWYRTIQERTIADFPLTTRSADFRDGRMSKTFMTALVLSLQSCGRLYVGQRHYSAFECA 
VljCLYLLYRTTHESSPDRDRAPVAFGDLLARLPRYLARLAAVIGDESGRPQYRYRDDKLP 
KAQFAAAGGRYEHGALATHWIATLVRHGVLPAAPGDVPRDTSTRVNPDDVAHRDDVNRA 
AAAFLARGHNLFLWEDQTLLRATANT I TALAVLRRLLANGNVYADRLDNRLQLGMLI PGA 
VPAEAIARGASGLDSGAIKSGDNNLEALCWYVLPLYQADPTVELTQLFPGAGRPVPGRP 
GGAATGVDEARGGYWGRP PGGARAPHRAGAHQ PH PHKHHPCGGDY * 
Gene matched: gi | 136863 | sp | P10209 |UL25_HSV11
Gene name: VIRION PROTEIN UL25. gi|7406 


[SEQ ID NO: 52] 
ORF # = 15 from Contig 104 
ORF start site = 23784 
ORF end site = 24071 
ORF sequence:

WDMLSGARQAALVRLTALEL INRTRTNTTPVGEI INAHDALG I QYEQGLGLLAQQARIG 
LASNAKRFATFNVGSDYDLLYFLCLGF I PQYLSVA* 



Gene matched: gi | 136863 | sp | P10209 |UL25_HSV11
Gene name: VIRION PROTEIN UL25. gi|7406 



[SEQ ID NO: 53] 
ORF # = 16 from Contig 104
ORF start site = 24292 
ORF end site = 25638 
ORF sequence: 

VRVPMASAEMRERLEAPLPDRAVPIWAGFLALYDSGDPGELALDPDTVRAALPPENPLP 
INVDHRARCEVGRVIAVWDPRGPFFVGLIACVQLERVIiETAASAAIFERRGPALSREER
LLYLITNYLPSVSLSTKRRGDEVPPDRTLFAHVALCAIGRRLGTIVTYDTSLDAAIAPFR 
HLDPATREGVRREAAEAELALAGRTWAPGVEALTHTLLSTAVNNMMLRDRWSLVAERRRQ 
AG I AGHTYLQASEKFK IWGAESAPAPERGYKTGAPGAMDTS PAAS VPAPQVAVRARQVAS 
SSSSSSSFPAPADMNPVSASGAPAPPPPGDGSYLWIPAFHYNQLVTGQSAPHHPPLTACG 
LPAAGTVAYGHPGAG PS PHYP P P PAHPYPGYAVRGPQS PGGPDRRAGGGHRRRPPGGWAS
GGRRRPRDPGVGEPPPTRGGAAGVRLRP* 



Gene matched: gi | 1224097 
Gene name: (U49329) UL26 protease [Human herpesvirus 2]



[SEQ ID NO:54] 
ORF # = 17 from Contig 104 
ORF start site = 25463
ORF end site = 26221 
ORF sequence: 

MLFAGPSPLEAQIAALVGAIAADRQAGGLPAAAGDHGIRGSANRRRHEVEQPEYDCGRDE 
PDRDFPYYPGEARPEPRPVTDSRRAARQASGPHETITALVGAVTSLQQELAHMRARTHAPY 
GPYPPVGPYHHPHADTETPAQPPRYPAEAVYLPPPHIAPPGPPLSGAVPPPSYPPVAVTP






WO 98/20016 



PCT/US97/20016 



GPAPPLHQPSPAHAHPPPPPPGPTPPPAASLPQPEAPGAEAGALVNASSAAHVNVDTARA 
ADLFVSQMMGSR* 

Gene matched: gi | 1224097

Gene name: (U49329) UL26 protease [Human herpesvirus 2]



[SEQ ID NO: 55] = Contig ID 14



[SEQ ID NO: 56]
ORF # = 1 from Contig 14
ORF start site = 665
ORF end site = 787
ORF sequence:

VKYQPRKLGKFKFNNLRDCGLYQRSPLQKFARLDIQPLLH* 


Gene matched: gi | 76474 |pir| |JQ0950 

Gene name: ICP 18.5 protein - infectious laryngot 


[SEQ ID NO: 57] = Contig ID 38 

[SEQ ID NO: 58]

ORF # = 1 from Contig 38 
ORF start site = 273 
ORF end site = 43 
ORF sequence: 

V^LTAAQGVXPVSVDSTSGDRAQLNNNNNNNDDDYNNKKQLKPLQTQTTLSHSFEVSSGS
PNTEVEIGERTDYLLK* 

Gene matched: gi | 1020200 
Gene name: (U31782) minor capsid protein L2 [Human papillom



[SEQ ID NO: 59] = Contig ID 50 

[SEQ ID NO: 60]
ORF # = 1 from Contig 50
ORF start site = 365
ORF end site = 3
ORF sequence:

MSRRS PRRRG PRRRPRPGG PTVPRPGAFPTADSQMVPAYDSGTAVESAPAASSLLRRWLL 
VPQADDSDDADYAGNDDAEWANS PPSEGGGKAPEAPHAAPASAC PPPPPRKERGQQRPLP 
X 


Gene matched: gi | 132753 | sp| P28283 |RL1_HSV2H
Gene name: NEUROVIRULENCE FACTOR (ICP34.



[SEQ ID NO: 61] = Contig ID 53



[SEQ ID NO: 62] 
ORF # = 1 from Contig 53 
ORF start site = 754
ORF end site = 380 
ORF sequence: 

VETAHARMYPDAPPLRLCRGANVRYRVRTRFGPDTLVPMSPREYRRAVLPALDGRAAASG 
AGDAMAPGAPDFCEDEAHSHRACARWGLGAPLRPVYVALGRDTVRGGPADLLGPRREFCA 
RALL*

Gene matched: gi | 124141 | sp | P08392 | ICP4_HSV11 
Gene name: TRANS-ACTING TRANSCRIPTIONAL



[SEQ ID NO: 63] = Contig ID 67

[SEQ ID NO:64]

ORF # = 1 from Contig 67 
ORF start site = 487 
ORF end site = 26
ORF sequence: 

VSDGQHQATVXXEVQASEPYIRVANGFGLWPQGGQGTIDTXELHXDTNLDIRSGDEVHY
HVTAGRRWGQLLWATQS VTAFSQEDLLDGAI F YRLNGSLRTRDTLI FSMEMGPVHTDAT I 
QVTVALEGPLAPLKLVRHKKI YVFXGRGSWGI L * 

Gene matched: gi | 560570 | bbs | 151525
Gene name: chondroitin sulfate proteoglycan NG2=t

[SEQ ID NO: 65]
ORF # = 2 from Contig 67
ORF start site = 353 
ORF end site = 511 
ORF sequence: 

VELXCVNGALTSLRDHKAKTVGHTDVGLRGLHLXXHSGLVLPIGHTCSWIQP* 



Gene matched: gi | 1079684 

Gene name: (U39205) Lpel4p [Saccharomyces cerevisiae]

[SEQ ID NO: 66] = Contig ID 74 



[SEQ ID NO: 67]

ORF # = 1 from Contig 74 
ORF start site = 224 
ORF end site = 412 
ORF sequence: 

MITGLDNNVCYPITQFAIYNRLTCDKTYRIMPEYAHEAMNVFVNDQVYNWLCGSEIPFKY
LK* 

Gene matched: gi| 550075 
Gene name: (D10935) cephalosporin-C deacetylase [Bacillus su

[SEQ ID NO: 68] = Contig ID 76 


[SEQ ID NO: 69] 
ORF # = 1 from Contig 76 
ORF start site = 111 
ORF end site = 1 
ORF sequence:

MALTEDASSDSPTSAPEKTPLPVSATAMDQAYRYSXX 



Gene matched: gi | 138297 | sp | P13290 |VGLG_HSV2
Gene name: GLYCOPROTEIN G. gi | 419139 | pir

[SEQ ID NO: 70] = Contig ID 82


[SEQ ID NO:71] 
ORF # = 1 from Contig 82 
ORF start site = 767 
ORF end site = 1156
ORF sequence: 

VALAPYVNKTVTGDCL PVLDMETGH IGAYWLVDQTGNVADLLRAAAPAWSRRTLL PEHA 
RNCVRPPDYPTPPASEWNSLWMTPVGNMLFDQGTLVGALDFHGLRSRHPWSREQGAPAPA 
GDAPAGHGE* 

Gene matched: gi | 124135 | sp | P28284 | ICP0_HSV2H
Gene name: TRANS-ACTING TRANSCRIPTIONAL

[SEQ ID NO: 72] = Contig ID 87 



[SEQ ID NO: 73]
ORF # = 1 from Contig 87
ORF start site = 519 
ORF end site = 1475 
ORF sequence: 

MLNDMQWLASSDSEEETEVGISDDDLHRDSTSEAGSTDTEMFEAGLMDAATPPARPPAER 
QGS PTPADAQGSCGGGPVGEEEAEAGGGGDVCAVCTDEIAPPLRCQSFPCLHPFC I PCMK
TWI PLRNTCPLCNTPVAYLI VGVTASGSFSTI PIVNDPRTRVEAEAAVRSGTAVDFIWTG 
NPRTAPRSLSLGGHTVRALSPTPPWPGTDDEDDDPPDGEGGRGSGTGRGSGTGRGSGTGR 
GSGTGRGSGGGQALTGGSRLCLPLQPELI SRPPPNTS PPGAAVPGPPLVTPPPLLPNLRP 
PAPPGTTLTRGPPFLGRGF 

Gene matched: gi | 124135 |sp|P28284 | ICP0_HSV2H
Gene name: TRANS-ACTING TRANSCRIPTIONAL

[SEQ ID NO: 74] = Contig ID 89

[SEQ ID NO: 75]
ORF # = 1 from Contig 89
ORF start site = 259 
ORF end site = 615 
ORF sequence: 
MLADRWRKHTDGNWYVJFDNSGEMATG

EGAMVSNAFIHSAGRNRLVLPQTRPNTGRQARIHSRPRWLDYVKIIMECLSNQNPAYY* 

Gene matched: gi | 113676 | sp| P06653 |ALYS_STRPN 
Gene name: AUTOLYSIN (N-ACETYLMURAMOYL-



[SEQ ID NO: 76] = Contig ID 90 


[SEQ ID NO: 77] 
ORF # = 1 from Contig 90 
ORF start site = 507 
ORF end site = 2702
ORF sequence: 

VKTIKSMDMPVATSFLAPDGTPLQYALCFPAVTDKLGALLMRPEAACVRPPLPTDVLESA 
PTVTAMYVLTVVNRLQLALSDAQAANFQLFGRFVRHRQATWGASMDAAAELWALVATTL 
TREFGCRWAQLGWASGAAAPRPPPGPRGSQRHCVAFNENDVLVALVAGVPEHIYNFWRLD 

LVRQHETiTMHLTLERAFEDAAESMLFVQRLTPHPDARIRVLPTFLDGGPPTRGLLFGTRLA
DWRRGKLSETDPLAPWRSAIjELGTQRRDAPALGKLSPAQALAAVSVLGRMCLPSAALAAIj 
WTCMFPDDYTEYDSFDALUVARLESGQTLGPAGGREASLPEAPHALYRPTGQHVAVLAAA 
THRTPAARVTAMDLVIJ^VLIXSAPVWALRNTTAFSRESELELCLTLFDSRPGGPDAALR 
DWS SDI ETWAVGLLHTDLNPI ENACLAAQLPRLSALI AERPLADGPPCLVLVDI SMT PV 

AVLV^PEPPGPPDVRFVGSEATEELPFVATAGDVLAASAADADPFFARAILGRPFDASL
LTGELFPGHPWQRPLADEAGPSAPTAARDPRDLAGGDGGSGPEDPAAPPARQADPGVLA 
PTFLTDATTGEPVPPRMWAWIHGLEELASEDAGGPTPNPAPALLPPPATDQSVPTSQYAP 
RPIGPAXTARETRPSVPPQQNTGRVPVAPRXDPRPSPPTPSPPADAAVPPPAFSGFAAAF 
SAAVPRVRRSRR 


Gene matched: gi | 135576 | sp| P10220 |TEGU_HSV11 
Gene name: LARGE TEGUMENT PROTEIN (VIRI 


[SEQ ID NO: 78] = Contig ID 91 



[SEQ ID NO: 79] 
ORF # = 1 from Contig 91
ORF start site = 364
ORF end site = 2751 
ORF sequence: 

VCPPPTGATWQFEQPRRCPTRPEGQNYTEGIAWFKENIAPYKFKATMYYKDVTVSQVW 
FGHRYSQFMGIFEDRAPVPFEEVIDKINAKGVCRSTAKYVRNNMETTAFHRDDHETDMEL
KPAKVATRTSRGWHTTDLKYN PSRVEAFHRYGTTVNC I VEEVDARSVYPYDEFVLATGDF 
VYMS PFYGYREGSHTEHTS YAADRFKQVDGFYARDLTTKARATS PTTRNLLTTPKFTVAW 
DV^KRPAVCTMTKWQEVDEMLRAEYGGSFRFSSDAISTTFTTNLTQYSLSRVDLGDCIG 
RDAREAI DRMFARKYNATH IKVGQPQYYLATGGFLI AYQPLLSNTLAELYVREYMREQDR 
KPRNATPAPLREAPS ANASVERI KTTS S I EFARLQFTYNHIQRHVNDMLGRI AVAWCELQ
NHELTLWNEARKLNPNAIASATVGRRVSARMIX3DVMAVSTCVPVAPDNVIVQNSMRVSSR 
PGTCYSRPLVSFRYEDQGPLIEGQLGENNELRLTRDALEPCTVGHRRYFIFGGGYVYFEE 
YAYSHQLSRADVTTVSTFIDLNITMLEDHEFVPLEVYTRHEIKDSGLLDYTEVQRRNQLH 
DLRFADIDTVIRADANAAMFAGLCAFFEGMGDLGRAVGKVVMGVVGGVVSAVSGVSSFMS 
NPFGALAVGLLVIAGLVAAFFAFRYVLQLQRNPMKALYPLTTKELKTSD PGGVGGEG EEG
AEGGGFDEAKLAEAREMIRYMXLVSAMERTEHKARKKGTSALLSSKWNMVLRKRNKARY 
SPLHNEDEAGDEDEL* 

Gene matched: gi | 138198 | sp | P06763 |VGLB_HSV23
Gene name: GLYCOPROTEIN B PRECURSOR . gi 



[SEQ ID NO: 80] = Contig ID 93 



[SEQ ID NO: 81]
ORF # = 1 from Contig 93
ORF start site = 533 
ORF end site = 1678 
ORF sequence: 

VALFVPLRLGWDPQTGLVVRVERASWGPPAAPRAAL 

RLAWARLAAIRNS PQCAS S ASLAVT I TTRTARFARETYTTLAFPPTSKEGAFADLVEVCEV 
CLRPRGHPHRVTARVLLPRGYNYFVSAGDGFSAPALVALFRQWHTTVHPAPGALAPVFAF 
LGPGFEVRGGPLQYFAVLGFPGWPPFTVPAAAAAESVRDLLRGAACTHPLCPGGPGPRWA 
PRSSCPRGHGRPWPRRRPAASCPPFGKRWRGGTPRPPPSNYSTPRRPSGRSGRRGFVSPG 
SRPSSWPPSRASGRPGCRKPGGGRAWKGWTRWWRPPPRSPGPGPCWSAWCRTRATPAPRS 
GSCSAGSWPPSACRSSRRPAR* 

Gene matched: gi | 136802 | sp | P10192 |UL08_HSV11 
Gene name: PROTEIN UL8. gi | 73829 | pir | | W

[SEQ ID NO: 82]
ORF # = 2 from Contig 93 
ORF start site = 1288 
ORF end site = 2448
ORF sequence: 

VASEAAGRI^PAFREAVARWHPTATTIQLL^ 

GEAGLPEARGRAGLERLDALVAAAPS EPWARAVLERLVPDACDAC PALRQLLGGVMAAVC 
LQIEQTASSVKFAVCGGTGAAFWGLFNVDPGDADAAHGAIHDARRALEASVRAVLSANGI 
RPRIAPSLALEGVYTHVVTWSQTGAWFWNSRDDTDFLQG^^

ILRRPAAGPPEEAVCAARGIMEDACDRFVLDAFGRRLDAEYWSVLTPPGEADDPLPQTAF 
RGGALLDAEQYWRRWRVC PGGGES VGVPVDLYPRPLVLPPVDCAHHLRE I LRE IQLVFT 
GVLEGVWGEGGSFVYPFEEKMRFLFP* 

Gene matched: gi | 136802 |sp|P10192|UL08_HSV11
Gene name: PROTEIN UL8. gi | 73829 |pir | |W

[SEQ ID NO: 83]
ORF # = 3 from Contig 93 
ORF start site = 3631 
ORF end site = 2705 

ORF sequence:

VRRTRAGASNAGMADPTPADEGTAAAILKQAIAGDRSLVEVAEGISNQALLRMACEVRQV 
SDRQPRFTATSVLRVDVTPRGRLRFVLDGSSDDAYVASEDYFKRCGDQPTYRGFAVWLT 
ANEDHVHSLAVPPLVLLPIRLSLFRPTDLRDFELVCLLMYLENCPRSHATPSLFVKVSAWL 
GWARHASPFERVRCLLLRSCHWILNTLMCMAGWPFDDELVLPHWYMAHYLLANNPPPV 

LSALFCATPQSFALQLPGPVPRTDCVAYNPAGWGSCV^SKDLRSALVYWWLSGSPKRRT
SSLFYRFC* 



Gene matched: gi | 136798 | sp| P10191 |UL07_HSV11 
Gene name: PROTEIN UL7. gi | 73828 |pir | |W



[SEQ ID NO: 84]
ORF # = 4 from Contig 93
ORF start site = 4286
ORF end site = 3570 
ORF sequence: 

MSPATQLQARDRELRRAQAGALEREHRAADRAAGGGAGRPAEADLLRADYDIIDVSKSMD 
DDTYVANSFQHQYIPAYGQDLERLSRLWEHELVRCFKILRHRNNQGQETSISYSSGAIAS 
FVAPYFEYVLRAPRAGALITGSDVILGEEELWEAVFKKTRLQTYLTDVAALFVADVQHAA






LPRPPSPTPADFRASASPRGGSRSRTRTRSRS PGRTPRGAPDQGWGVQRRDGRPHARR * 

Gene matched: gi | 136794 | sp| P10190 |UL06_HSV11
Gene name: VIRION PROTEIN UL6. gi | 73994



[SEQ ID NO: 85] = Contig ID 94 


[SEQ ID NO: 86] 
ORF # = 1 from Contig 94 
ORF start site = 3669 
ORF end site = 496 

ORF sequence:

PRLSRAYLRHARGFEGSPGDTYPLRIGRRQSFPFGPAVSAPRRRARTPVAMSDSALQVPA 
PAGMTP PSAP P PNG PLQVLLGSLTNLRR P PS PSSEPAGSADEPAFLSAAKLRAATAAFLL 
SGAAVGPAEARACWH PLLEQLCALHRAHGL PETAL LAENL PGLLVHRMAVAL PETPEAAF 
REMDVI KDTVLAI TGSDTTHALEAAGLRTTAALG PVRVRQCAVEWI DRWRTVTQ SCLAMN 

PRTSLEALGEMSLKMSPVPLGQPGANLTTPAYSLLFPSPIVQEGLRFLALVSNWVTLFSA
HLQRIDDAALTPLTRALFTLALVDEYLTTPDRGAVVPPPLLAQFQHTVREIDPAIMIPPL 
EATKMVRSREEVRVSTALSRVSPRSACAPPGTLMARVRTDAAVFDPDVPFLSASALAIFR 
PAVTGLLQLGEPPSAGAQQRLLALLQQTWALVQNSNSPSWINTLTDAGFTPAHCTQYIS 
ALEGFLVAGVPARTPPGHGLSEIQQLFGC IALAGANVFGLAREYGHYAGYVKTFRRI QGA 

SEHTHGRLCEAVGLSGGVLSQTLARIMGPAVPTEHLASLRRTLVGEFETAERRFSAGQPS
LLRETALIWLDVYGQTHWDLTPTTPATPLSALLPVGPPSHAPSVHLAAATKIRFPALEGI 
HPNVIADPGFVPYVLALWGDALRATCNAAYLPRPIEFALRVIiAWARDFGLGYLPW 
RTKLGALITLLE PATRAGVGPTMQMADNI EQLLRELYVI ARGAVEQLRPAVQLPPPQP PE 
VGSSLLLISMYAIxAARGWQEFAERADPLVRQLEDAIVLLRLHMRTLAAFFECRFESEXSH 

RLYAWADAHERLGPWRPEAMGDAVSQYCGMYHDAKRALVASLAGLRSVVTETTAHLGVC
DELAAQVSHEGNVIAVVRREIHGFI^IVSGIHARASKLWSGDQVPGFCYMSQFLAR 
SAGYQAARAATGPERVAEFVQELHDTWKGLQTERALWAPFASSGDQRTAAIQEVMAHAN 
EDAPPARPQTRRAHKRHDWGAGXTXXGAWVXDWXDS * 


Gene matched: gi | 221758 

Gene name: (D10879) UL37 [Herpes simplex virus type 1]

[SEQ ID NO: 87] = Contig ID 95

[SEQ ID NO: 88] 
ORF # = 1 from Contig 95 
ORF start site = 371
ORF end site = 18
ORF sequence: 

VLLDAPAPTASGRTKTPAQGLAKEVQFSTAP PS PTAPWT PRVAGFNKRVFCAAVGRLAAT 
HARIAAVQLWDMSRPHTDGDLNELLDLTTIRVTVCEGKNLLQRANELVNPDAAQGIL* 



Gene matched: gi | 136927 | sp| P10233 |UL49_HSV11
Gene name: TEGUMENT PROTEIN UL49. gi|73 


[SEQ ID NO: 89] 
ORF # = 2 from Contig 95 
ORF start site = 831 
ORF end site = 436
ORF sequence: 

MTSRRSVKSCPREAPRGTHEELYYGPVSPADPESPRDDFRRGAGPMRARPRGEVRFLHYD 
EAGYALYRDSSSSEDNDESRDTARPRRSASVAGSHGPGPARAPPPPGGPVGAGGRSHAPP 
ART PKMTRG AP * 

Gene matched: gi | 136927 | sp| P10233 |UL49_HSV11
Gene name: TEGUMENT PROTEIN UL49. gi|73 

[SEQ ID NO: 90]

ORF # = 3 from Contig 95 
ORF start site = 1441 
ORF end site = 2550 
ORF sequence: 

MSQWGPRAILVQTDSTNRNADGDWQAAVAIRGGGVVQLNMVNKRAVDFTPAECGDSEWAV
GRVSLGLRMAMPRDFCAI I HAPAVSGPGPHVMLGLVHSG YRGTVLAVWS PNGTRGFAPG 
ALRVDVTFLDIRATPPTLTEPSSLHRFPQLAPSPLAGLREDPWLDGALATAGGAVALPAR 
RRGGSLVYAGELTQVTTEHGDCVHEAPAFLPKREEDAGFDILIHRAVTVPANGATVIQPS 
LRVLRAADGPEACYVIXIRSSLNARGLLVMPTRW 

QLLVAGTHALPWI PPDNIHEDGAFRAYPRGVPDATATPRDPPILVFTNEFDADAPPSKRG
AGGFGSTGI * 



Gene matched: gi| 118955 | sp| P10234 |DUT_HSV11 
Gene name: DEOXYURIDINE 5'-TRIPHOSPHATE



[SEQ ID NO: 91] 
ORF # = 4 from Contig 95
ORF start site = 3535
ORF end site = 2756 
ORF sequence: 

VGWGKAGAEPRACSGMASLLG^IXGWGTRPEEQQYEMIRAADPPSEAEPRLQEALAVVNA 
LLPAPITLDDALESLDDTRRLVKARALARTYHACMVNLERLARHHPGLEGSTIIDGAVA^
RDKMRRLADTCMATILQMYMSVGAADKSADVLVSQAIRSMAESDVVMEDVAIAERALGLS 
TSALAGGTRTAGLGATEAPPGPTRAQAPEVASVPVTHAGDRSPVRPGPVPPADPTPDPRH 
RTSAPKRQASSTEAPLLLA* 


Gene matched: gi| 136933 | sp| P10235 |UL51_HSV11 
Gene name: PROTEIN UL51. gi | 73813 |pir| | 


[SEQ ID NO: 92] 
ORF # = 5 from Contig 95 
ORF start site = 2889 
ORF end site = 5042 

ORF sequence:

VTGTDATSGACALVGPGGASVAPS PAVRVP PARAEVERPRARS AI ATS SMTTSL SAMLRM 
AWETSTSADLSAAPTDMYICRMVAMHVSARRRILSRCAATAPSMVEPSSPGWWRASLSRL 
TMQAWYVRARARAFTRRRVSSSDSRAS SSVMGAGKS ALTTARASC SRGS AS EGGSAARI I 
SYCCSSGRVPQPHSTPSRDAI PEHARGSAPAFPHPTPSGFAGAMGTEDCDHEGRSVAAPV 

EJVMALYATDGCVITSSLALLTNCLLGAEPLYIFSYDAYRPDAPNGPTGAPTEQERFEGSR
ALYRDAGGLNGDSFRVTFCLLGTEVGVTHHPKGRTRPMFVCRFERADDVAVLQDALGRGT 
PLLPAHITATLDLEATFALHANIIMALTVAIVHNAPARIGSGSTAPLYEPGESMRSWGR 
MSLGQRGLTTLFVHHKARVLAAYRRAYYGSAQSPFWFLSKFGPDKKSLVLAARYYLLQAP 
RLGGAGATYDLQAVKDI CATYAI PHDPRPDTLSAASLTSFAAITRFCCTSQ YSRGAAAAG 

FPLYVERRIAADVRETGALEKFIAHDRSCLRVSDREFITYIYLAHFECFSPPRLATHLRA
VTTHDPSPAASTEQPSPLGREAVEQFFRHVRAQLNIREYVKQNVTPRETALAGDAAAAYL 
RARTYAPAALTPAPAYCGVADSSTKMMGRLAEAERLLVPHGWPAFAPTTPGDDAGGGI 

Gene matched: gi | 136939 | sp | P10236 |UL52_HSV11
Gene name: DNA REPLICATION PROTEIN UL52 



[SEQ ID NO:93] = Contig ID 96 



[SEQ ID NO: 94] 
ORF # = 1 from Contig 96 
ORF start site = 2599 
ORF end site = 1064
ORF sequence:

VGGCVDKLPLLKTPGPVARGARWLARATRRMACRKFCGVYRRPDKRQEASVPPETNTAPA 
FPASTFYTPAEDAYLAPGPPETIHPSRPPSPGEAARLCQLQEILAQMHSDEDYPIVDAAG 
AEEEDEADDDAPDDVAYPEDYAEGRFLSMVSAAPLPGASGHPPVPGRAAPPDVRTCDSGK 
MGATGFTPEELDTMDREALRAISRGCKPPSTLAKLVTGLGFAIHGTLIPGSEGCVFDSSH
PNYPHRVIVKAGWYASTNHEARLLRRI^PAILPLLDLHWSGVTCLVLPKYHCDLYTYL 
SKRPSPLGHLQITAVSRQLLSAIDYVHCEGIIHRDIKTENILINTPENICLGDFGAACFV 
RGCRSSPFHYGIAGTIDTNAPEVLAGDPYTQVIDIWSAGLVIFETAVHTASLFSAPRDPE 
RRPCDNQI ARI IRQAQVHVDEF PTHAESRLTAHYRSRAAGNNRPAWTRPAWTRYYKIHTD 
VEYLICKALTFDAALRPSAAELLRLPLFHPK*



Gene matched: gi | 125617 | sp| P13287 |KR1_HSV2 
Gene name: SERINE/THREONINE-PROTEIN KINAS



[SEQ ID NO: 95] 
ORF # = 2 from Contig 96 
ORF start site = 2795
ORF end site = 3373 
ORF sequence: 

MGVVWSVVTLLNQRNALPRTSADASPALWSFLLRQCRILASEPLGTPVVVRPANLRRLA 
EPLMDLPKFTRPIVRTRSCRCPPNTTTGLFAEDDPLESIEILDAPACFRLLHQERPGPHR 
LYHLWWGAADLCVPFLEYAQKTRLGFRFIAMKTNDAWVGEPWPLPDRFLPERTVSWTPF
PAAPNHPLGKSP* 



Gene matched: gi | 137125 | sp | P13292 | US02_HSV2 
Gene name: PROTEIN US2. gi | 419137 |pir| | A



[SEQ ID NO: 96]
ORF # = 3 from Contig 96
ORF start site = 3534
ORF end site = 3671 
ORF sequence: 

MGRPEIPDEPSWQTGDDDPQNPGPPLAVGDEWPPSSHVCYPITNL* 


Gene matched: gi | 137125 | sp | P13292 |US02_HSV2 
Gene name: PROTEIN US2. gi | 419137 |pir| | A

[SEQ ID NO: 97]
ORF # = 4 from Contig 96 
ORF start site = 5400 
ORF end site = 3853 
ORF sequence:

VGRMRVGERERGKKKKEGRRRRKREGGEGKGKEEEGGEEX3EVREKGERDRGGGEGGGREK 
RGEKGDGGGGPRSQHPRFI AGRAPPSWTGHRCGNWRQGVATMADI PPDPPAVNTT PANHA 
PPSPPPGSRKRRRPVLPSSSESEGKPDTESESSSTESSEDEAGDLRGGRRRSPRELGGRY 
FLDLSAESTTGTESEGTGPSDDDDDDASIX^VDTPPRKSKRPRINLRLTSSPDRRAGVV 
FPEVWRNDRPIRAAQPQAPAQSSGDRAAAPRRSARQAQMRSGAAOTLDLHYIRQCVNQLF

RILRAAPNPPGSANRLRHLVRIX:YLMGYCRTRLGPRTWGRLLQISGGTWDWLRNAIRW
EARFEPAAEPVCELPCLNARRYG PECDVGNLETNGGSTSDDEI SDATDSDDTLASHSDTE
GGPSPAGRENPESASGGAIAARLECEFGTFDWTSEEGSQPWLSAWADTSSAERSGLPAP
GACRATEAPEREDGCRKMRFPAACPYPCGHTFLRP*

15 

Gene matched: gi | 124184 | sp | P04485 | IE68JHSV11 
Gene name: IMMEDIATE -EARLY PROTEIN IE68 


[SEQ ID NO: 98] = Contig ID 98 

[SEQ ID NO: 99]

ORF # = 1 from Contig 98
ORF start site = 612 
ORF end site = 872 
ORF sequence: 

MVMAACPTEPPGGSVGPADQPRVQSSRTWRPPLVNSRELYRAQRAARCASSSDTPQAPGW
CGGTCRHAVFGWAVWVIILAFLWR*



Gene matched: gi|136952|sp|P28282|UL56_HSV2H
Gene name: PROTEIN UL56. gi|73833|pir||

[SEQ ID NO: 100] 
ORF # = 2 from Contig 98 
ORF start site = 1689
ORF end site = 1045 
ORF sequence: 

MWGPGPARFIARPGTHGRRVFTDPPPRNMTTTPLSNLFLRAPDITHVAPPYCLNATWQAE 
NALHTTKTDPACLAARSYLVRASCSTSGPIHCFFFAVYKDSQHSLPLVTELRNFADLVNH 
PPVLRELEDKRGGRLRCTGPFSCGTIKDVSGASPAGEYTINGIVYHCHCRYPFSKTCWLG
ASAALQHLRFISSSGTAARAAEQRRHKIKIKIKV*

Gene matched: gi|136947|sp|P28281|UL55_HSV2H
Gene name: PROTEIN UL55. gi|73806|pir||

[SEQ ID NO:101] 
ORF # = 3 from Contig 98 
ORF start site = 2705
ORF end site = 1821 
ORF sequence: 

MALSLTPPHADGRAPVPERKAPSADTIDPAVRAVLRSISERAAVERISESFGRSALVMQD 
PFGGMPFPAANSPWAPVLATQAGGFDAETRRVSWETLVAHGPSLYRTFAANPRAASTAKA 
MRDCVLRQENLIEALASADETLAWCKMCIHHNLP

QCYLKARGLCGLDDLCSRRRLSDIKDIASFVLVIIJUU^ANRVERGVSEIDYTTVGVGAGE 
TMHFYIPGACMAGLIEILDTHRQECSSRVCELTASHTIAPLYVHGKYFYCNSLF*



Gene matched: gi|124181|sp|P28276|IE63_HSV2H
Gene name: TRANSCRIPTIONAL REGULATOR IE



[SEQ ID NO: 102]

ORF # = 4 from Contig 98 
ORF start site = 4922 
ORF end site = 3906 
ORF sequence: 

MLAVRSLQHLTTVIFITAYGLVLAWYIVFGASPLHRCIYAVRPAGAHNDTALVWMKINQT
LLFLGPPTAPPGGAWTPHAHVCYANIIEGRAVSLPAIPGAMSRRVMNVHEAVNCLEALWD
TQMRLVWGWFLYLAFVALHQRRCMFGWSPAHSMVAPATYLLNYAGRIVSSVFLQYPYT
KITRLLCELSVQRQTLVQLFEADPVTFLYHRPAVGVIVGCELLLRFVALGLIVGTALISR 
GACAITYPLFLTITTWCFVSIIALTELYFILRRDSAPKNAEPAAPRGRSKGWSGVCGRCC 

SIILSGIAVRLCYIAWAGWLMALRYEQEIQRRLFDL*

Gene matched: gi|116105|sp|P22485|CELF_HSV2H
Gene name: CELL FUSION PROTEIN PRECURSO


[SEQ ID NO: 103] 
ORF # = 5 from Contig 98 
ORF start site = 6334 
ORF end site = 4874
ORF sequence:

AAFDLEVroHRPFAPGPALPPGGI^VGGHMYVNRNEIFNAAIjAVTNIILDLDIALKEPVP 
FPRLHEALGHFMRGALAAVXLLFPAARVNPDAYPCYFFKSACRPRAPPVCAGDGPSAGGD 
IX3DGDWFPDAGGDIX3DEEWEEDTDPMDTTHGPLPDDEAAYLDLLHEQIPAATPSEPDSVV 
CSCADKIGLRVCLPVPAPYVVHGSLTMRGVARVIQQAVLLDRNFVEAVGSHVKNFLLIDT
GVYAHGHSLRLPYFAKIGPDGSACGRLLPVFVIPPACEDVPAFVAAHADPRRFHFHAPPM 
FSAAPREIRVLHSLGGDYVSFFEKKASRNALEHFGRRETLTEVLGRYDVRPDAGETVEGF 
ASELLGRIVACIEAHFPEHAREYQAVSVRRAVIKDDWVLLQLIPGRGALNQSLSCLRFKH 
GRASRATARTFLALSVGTNNRLCASLCQQCFATKCDNNRLHTLFTVDAGTPCSRSAPSST 
SRPSSS*



Gene matched: gi|136939|sp|P10236|UL52_HSV11
Gene name: DNA REPLICATION PROTEIN UL52 


[SEQ ID NO: 104] = Contig ID 99

[SEQ ID NO: 105]

ORF # = 1 from Contig 99 
ORF start site = 213
ORF end site = 659 
ORF sequence: 

VGVGVRGWGGGXCGGLWGGWCWVXWGGWGVFFCFFLFCXXFXXXXXXXFLAPDLTDPL
LFAYVGFQWNHGLMFWPDIAVYAMLGGAVWISLTQVLGLRRRLHKDPDAGPWAAATLR
GLFFSVYALGFAAGVLVRPRMAASRRSG*

Gene matched: gi|807644

Gene name: (M10053) unknown protein [Herpes simplex virus ty 

[SEQ ID NO: 106] 
ORF # = 2 from Contig 99 
ORF start site = 757
ORF end site = 2403 
ORF sequence: 

MGAGVPWTGI KARGAGGPITVR VLGWEVAQKATH PCCSC PREAWSGNPPRCAGRAHRS F 
AGAGALLV^SALGRVGLAVGLWGLLWG VVVVIiANAS PGRTITVGPRGKESNAAPSAS PRN 

40 ASAPRTTPTPPQPRKATKSKASTAKPAPPPKTGPPKTSSEPVRCNRHNPLARYGLRVQIR 
CRFPNSTRTESRLQIWRYATATDAEIGTAPSLEEVMVNVSAPPGGQLVYDSAPNRTDPHV 
I WAEGAGPGAS PRLYSWGPLGRQRLI I EELTLETQGMYYWVWGRTDRPSAYGTWVRVRV 
FRPPSLTIHPHAVLEGQPFKATCTAATYYPGNRAEFVWFEDGRRVFDPAQIHTQTQENPD 
GFSTOSTVTSAAVGGQGPPRTFTCQLTWHRDSVSFSRRNASGTASVLPRPTITMEFTGDH 

AVCTAGCVPEGVTFAWFLGDDSSPAEKVAVASQTSCGRPGTATIRSTLPVSYEQTEYICR
LAGYPDGIPVLEHHGSHQPPPRDPTERQVIRAVEGAGIGVAVLVAWLAGTAWYLTHAS
SVRYRRLR* 

Gene matched: gi|138220|sp|P06475|VGLC_HSV23
Gene name: GLYCOPROTEIN C PRECURSOR, gi 




[SEQ ID NO: 107] 
ORF # = 3 from Contig 99 
ORF start site = 2634 
ORF end site = 3152 
ORF sequence:

MAFRASGPAYQPLAPAASPARARVPAVAWIGVGAIVGAFALVAALVLVPPRSSWGLSPCD
SGWQEFNAGCVAWDPTPVEHEQAVGGCSAPATLIPRAAAKHLAALTRVQAERSSGYWWVN
GDGIRTCLRLVDSVSGIDEFCEELAIRICYYPRSPGGFVRFVTSIRNALGLP*


Gene matched: gi|136917|sp|P06483|UL45_HSV23
Gene name: PROTEIN UL45 HOMOLOG (18 KD

[SEQ ID NO: 108]

ORF # = 4 from Contig 99 
ORF start site = 4072
ORF end site = 3419 
ORF sequence: 

MAGAPPRLPPRNPAPPEQRPAAAARPLAAHREAAGVYNAVRTWGPDAEAEPDQMENTYLL
PEDDAAMPAGVGLGSTPAADTTAAAWPAESHAPRAPSEEADSIYESVSEDGGRVYEEIPW 
VRVYENICLRRQDAGGAAPPGDAPDSPYIEAENPLYDWGGSALFSPPGATRAPDPGLSLS 
PMPARPRTNALANDGPTNVAALSALLTKLKRGRHQSH*


Gene matched: gi|114350|sp|P10230|ATI2_HSV11
Gene name: ALPHA TRANS-INDUCING FACTOR

[SEQ ID NO: 109]

ORF # = 5 from Contig 99 
ORF start site = 5584 
ORF end site = 4391 
ORF sequence: 

MQRRARGASSLRLARCLTPANLIRGANAGVPERRIFAGCLLPTPEGLLSAAVGVLRQRAD
DLQPAFLTGADRSVRIJ\ARHHNTVPESLIVTOLASDPHYDYIRHYASAAKOALGEVELSG
GQLSRAILAQYWKYLQTWPSGLDIPDDPAGXCDPSLHVLMRPTLLPKLVVRAPFKSGAA 
AAKYAAAVAGLRDAAHRLQQYMFF^PADPSRPSTDTALRLSELLAYVSVLYHWASWMLW 
TADKYVCRRLGPADRRFVALSGSLEAPAETFARHLDRGPSGTTGSMQCMALRAAVSDVLG 
HLTRLAHLWETGKRSGGTYGIVDAIVSTVEVLSIVHHHAQYIINATLTGYVVWASDSLNN
EYLRAAVDSQERFCRTAAPLFPTMTAPSWARMELSIK* 

Gene matched: gi|114351|sp|P08314|ATI2_HSV1F
Gene name: ALPHA TRANS-INDUCING FACTOR



[SEQ ID NO: 110]
ORF # = 6 from Contig 99
ORF start site = 7758
ORF end site = 5668 
ORF sequence: 

MSVRGHAVRRRRASTRSHAPSAHRAESPVEDEPEGGGVGLMGYLRAVFNVDDDSEVEAAG 
EMASEEPPPRRRREARGHPGSRRASEARAAAPPRRASFPRPRSVTARSQSVRGRRDSAIT 

20 RAPRGGYLGPMDPRDVLGRVGGSRWPSPLFLDELSYEEDDYPAAVAHDDGAGARPPATV 
EIIEGRVSGPELQAAFPLDRLTPRVAAWDESVRSALALGHPAGFYPCPDSAFGLSRVGVM 
HFASPADPKVFFRQTLQQGEALAWYVTGDAILDLTDRRAKTSPSRAMGFLVDAIVRVAIN 
GWVCGTRLHTEGRGSELDDRAAELRRQFASLTALRPVGAAAVPLLSAGGAAPPHPGPDAA 
VFRS S LGSLLYWPGVRALLGRDCRVAARYAGRMT YI ATGALLARFNPGAVKCVLPREAAF 

25 AGRVLDVLAVIAEQTVQWLSVVVGARLH PHSAHPAFADVEQEALFRALPLGS PGWAAEH 
EAI^DTAARRLIATSGLNAVLGAAOTALHTALATOT 

ATGL I LQRLLGLADTWAC VALAAFDGGSTAPEVGTYT PLRYACVLRATQPLYARTTPAK 
FWADVRAAAEHVDLRPASSAPRAPVSGTADPAFLLEDLAAFPPAPLNSESVLGPRVRVVD 
IMAQFRKLLMGDEETAALRAHVSGRRATGLGGPPRP* 


Gene matched: gi|136920|sp|P10231|UL47_HSV11
Gene name: VIRION PROTEIN UL47 (82/81 K


[SEQ ID NO: 111] 
ORF # = 7 from Contig 99 
ORF start site = 9949
ORF end site = 8279 
ORF sequence: 

VILKMRGGGREMSVIGDARHPRQFPSQGPRPFSVAGPGSLPPSPPPGARALLIRLSKSLS 
PDPTAPMDLLVNNLFADADGVSPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPV 
PPAALFNRLLDDLGFSAGPALCTMLDTWNEDLFSGFPTNADMYRECKFLSTLPSDVIDWG
DAHVPERSPIDIRAHGDVAFPTLPATRDELPSYYEAMAQFFRGELRAREESYRTVLANFC
SALYRYLRASVRQLHRQAHmGRNRDLREMLRTTIADRYYRETARLARVLFLHLYLFLS^ 
EILWAAYAEQMMRPDLFDGLCCDLESWRQIACLFQPLMFINGSLTVRGVPVEARRLRELN 
HIREHLNLPLVRSAAAEEPGAPLTTPPVLQGNQARSSGYFMLLIRAKLDSYSSVATSEGE 
5 SVMREHAYSRGRTRNNYGST I EGLLDLPDDDDAPAEAGLVAPRMS FLSAGQRPRRLSTTA 
PITDVSLGDELRLDGEEVDMTPADALDDFDLEMIX3DVESPSPGMTHDPVLYGALDVDDFE 
FEQMFTDAMGI DDFGG * 

Gene matched: gi|1168549|sp|P29793|ATIN_HSV23
Gene name: ALPHA TRANS-INDUCING PROTEI



TABLE 2



[SEQ ID NO: 112] = Contig ID 1

[SEQ ID NO: 113] >contig1 (start 332 - stop 874)

MRTPADDVSWRYEAPSVI DYARI DGI FLRYHCPGLDT FLWDRHAQRAYLVNPFLFAAGFLEDLSH S VF 
PA 

DTQETTTRRALYKEIRDALGSRKQAVSHAPVRAGCVNFDYSRTRRCVGRRDLRPANTTSTWEPPVSSD 
25 DE 

AS SQSKPLATQ P PVLALSNAP PRR VS PTRGRRRHTRLRRN * 

gi|136776|sp|P28278|VGLL_HSV2H GLYCOPROTEIN L PRECURSOR
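
Each Table 2 record pairs a header line giving the SEQ ID NO, contig name, and start and stop positions with a database annotation line. The sketch below shows one way such lines might be parsed once extraction artifacts have been normalized. The regular expression and the field names (seq_id, contig, accession, entry, and so on) are illustrative assumptions and are not defined by the listing itself.

```python
import re

# Matches headers such as "[SEQ ID NO: 113] >contig1 (start 332 - stop 874)".
HEADER = re.compile(
    r"\[SEQ ID NO:\s*(\d+)\]\s*>(\w+)\s*\(start\s+(\d+)\s*-\s*stop\s+(\d+)\)"
)


def parse_header(line: str):
    """Return the header fields as a dict, or None if the line does not match."""
    m = HEADER.search(line)
    if m is None:
        return None
    return {
        "seq_id": int(m.group(1)),
        "contig": m.group(2),
        "start": int(m.group(3)),
        "stop": int(m.group(4)),
    }


def parse_annotation(line: str) -> dict:
    """Split a 'gi|...|db|accession|entry DESCRIPTION' line into named fields."""
    ident, _, description = line.strip().partition(" ")
    fields = ident.split("|")
    fields += [""] * (5 - len(fields))  # shorter identifiers leave fields empty
    return {
        "gi": fields[1],
        "db": fields[2],
        "accession": fields[3],
        "entry": fields[4],
        "description": description,
    }


header = parse_header("[SEQ ID NO: 113] >contig1 (start 332 - stop 874)")
annotation = parse_annotation("gi|136776|sp|P28278|VGLL_HSV2H GLYCOPROTEIN L PRECURSOR")
print(header)
print(annotation)
```

Lines that still carry extraction damage, such as a mistyped bracket or a letter substituted for a digit in the contig name, return None from parse_header and can be set aside for manual correction.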

[SEQ ID NO: 114] >contig1 (start 747 - stop 1751)

MKRARSRSPSPPSRPSSPFRTPPHGGSPRREVGAGILASDATSHVCIASHPGSGAGYPTRLAAGSAVQ 
RR 

RPRGC PPGVMFS ASTT PEQPLGLSGDATPPL PTS VPLDWAAFRRAFLI DDAWR PLLE PELANPLTARL 
LA 

35 EYT)RRCQTEEVLPPREDWSWTRYCTPDDVRWIIGQDPYHHPGQAHGLAFSVRADV 
AV 

KNCYPDARMSGRGCLEKWARDGVLLLNTTLTWRGAAASHSKIJ3WDRFVGGW 
GA 

HAQNAIRPDPRQHYVLKFSHPSPLSKVPFGTCQHFLAANRYLETRDIMPIDWSV* 


gi|137039|sp|P28275|UNG_HSV2H URACIL-DNA GLYCOSYLASE

[SEQ ID NO: 115] >contig1 (start 1806 - stop 2507)

MVKSRVSYRSVMSGVGEERVPSAFTILASWGWTFAPQNHDPGASPNTTPIESIAGTAPDAHVGPLDGE 
PD
RDAISPLTSSVAGDPPGADGPYWFDTLFMVSSIDELGRRQLTOTIRKDLRLSLAKFSIACTKTSSFS
GT 

AARQRKRGAPPQRTCVPRSNKSLQMFVLCKRANAAQVREQLRAVIRSRKP^ 
FV 

HEFVSSEPMRLHRDNVMLSTEPD*

gi|136782|sp|P28279|UL03_HSV2H PROTEIN UL3

[SEQ ID NO: 116] >contig1 (start 3312 - stop 2707)

MGNPQTTIAYSLHHPRASLTSALPDAAQWHVFESGTRAVLTRGRARQDRLPRGGWIQHTPIGLLVI
ID 

CRAEFCAYRFIGRASTQRLERWWDAHMYAYPFDSWSSSHGESVRSATAGILTVVWTPDTIYITATIY 
GT 

APEAARGCDNAPLDVRPTTPPAPVSPTAGEFPANTTDLLVEVLREIQISPTLDDADPTPGT*


gi|136788|sp|P28280|UL04_HSV2H PROTEIN UL4

[SEQ ID NO: 117] >contig1 (start 6024 - stop 3379)

MAASGGEGSRDVRAPGPPPQQPGARPAVRFRDEAFLNFTSMHGVQPIIARIRELSQQQLDVTQVPRLQ 
20 WF 

RDVAALEVPTGLPLREFPFAAYLITGNAGSGKSTCVQTLNEVLDCWTGATRIAAQNMYVKLSGAFLS 
RP 

INTIFHEFGFRGNHVQAQLGQHPYTLASSPASLEDLQRRDLTYYWEVILDITKRALAAHGGEDARNEF 
HA 

25 LTALEQTLGLGQGALTRIASVTHGALPAFTRSNIIVIDEAGLIX5RHLLTTVVYCWWMINALYHTPQYA 
GR 

LRPVLVCVGSPTQTASLESTFEHQKLRCSVRQSENVLTYLICNRTLREYTRLSHSWAIFINNKRCVEH 
EF 

gnlmkvleyglpiteehmqfvdrfvvpesyitnpanlpgwtrlfsshkevsaymaklhay^ 
30 fv 

VFTLPVLTFVSVKEFDEYRRLTQQPTLTMEKWITANASRITNYSQSQDQDAGHTOCEVHSKQQLW^ 
ND 

ITYVLNSQVAVTARLRKMVFGFDGTFRTFEAVLRDDSFWTQGETSVEFAYRFLSRI^ 
LQ 

35 RPGLDATQRTIAYGRLGELTAELLSLRRDAAGASATRAADTSDRSPGERAFNFKHLGPRDGGPDDFPD 
DD 

LDVIFAGLDEQQLDVFYCHYALEEPETTAAVHAQFGLLKRAFLGRYLILRELFGEVFESAPFSTYVDN 
VI 

. FRGCELLTGS PRGGLMS VALQTDNYTLMG YTYTRVFAFAEELRRRHATAGVAEFLEES PLPYI VLRDQ 
40 HG 

FMSVVNTNISEFVESIDSTEIAMAINADYGISSKLAMTITRSQGLSLDKVAICFTPGNLRLNSAYVA^ 
SR 

TTSSEFLHMNLNPLRERHERDDVISEHILSALRDPNWIVY* 
gi|74000|pir||WMBEU5 gene UL5 protein - human herpesvirus 1

[SEQ ID NO: 118] >contig1 (start 5594 - stop 7375) translated

MVLMGRLRNAPESLTYMFCAAI RVAPVTTQSRTSLRVC THVLF PDPALPVMRYAANGNSRSGR PVGTS 
KA 

5 ATSRNHCRRGTCVTSSCCCESSRMRAMIGWTPCMDVKF^ 
PP 

DAAMAAQRARAPAmTRGGDAALCAPEIXSWVKVHPTPGTMLFREILLGQMGYT 
TR 

QLQAAIFHALLNATTYRDLEEDWRRHWARGLQPQRLWRYRNAREGDIAGVAERVFimJRCTLRTTL 
10 LD 

FAHGVVTKTFAPGGPSGPTSFPKYIDWLTCLGLVPILRKTREGE^TQRLGAFLRQHTLPRQLATVAGAA 
ER 

AGPGLLEIAVAFDSTRMAEYDRVHI YY1WRRGEWLVRDPVSGQRGECLVLC PPLWTGDRLVFDS PVQR 
LC 

1 5 PEI VACHALREHAHICRLRNTASVKVLLGRKS DS ERGVAGAARWNKALGEDDETKAGS AASRLVRL I 
IN 

MKGMRHVGDINDTVRAYLDEAGGHLIDTPAVDHTLPGFGKGGTGRGSAAQDPGARPQQLRQAFQTAW 
NN 

INGMLEGYINNLFGTI ERLRETNAGLATQLQARV 


gi|136794|sp|P10190|UL06_HSV11 VIRION PROTEIN UL6



[SEQ ID NO: 119] = Contig ID 10 Length: 21036 Type: N Check: 7835

[SEQ ID NO: 120] >contig10 (start 5688 - stop 1) translated

VAGAAHMIPAALPHPTMKRQGDRDIVVTGVRNQFATDLEPGGSVSCMRSSLSFLSLLFDVGPRDVLSA 
30 EA 

IEGCLVEGGEWTRAAAGSGPPRMCSIIELPNFLEYPAARGGLRCVFSRVYGEVGFFGEPTAGLLETQC 
PA 

HTFFAGPWAMRPLSYTLLTIGPLGMGLYRDGDTAYLFDPHGLPAGTPAFIAKVRAGDVYPYLTYYAHD 
RP 

35 KVRWAGAMVFFVPSGPGAVAPADLTAAALHLYGASETOLQDEPFVERRVAITHPLRGEIGGLGALFV^ 
W 

PRGDGEGSGPWPALPAPTHVQTPRADRP PEAPRGASGPPNTPQAGH PNRPPDDVWAAALEGT ppakp 
SA 

PDAAASGPPHAAPPPQTPAGDAAEEAEDLRVLEVGAVPVGRHRARYSTGLPKRRRPTWTPPSSVEDLT 

40 sg 

erpapkappakakkksapkkkapvaaevpassptpiaatvppapdtppqsgqgggddgpaspsspsvl 

ET 

LGARR P P E P PG ADLAQL F EVH PNVAATA VRLAARDAALAREVAAC SQLT I N ALRS P YP AH PG LLELC V 
IF
FFERVLAFLIENGARTHTQAGVAGPAAALLDFTLRMLPRKTAVGDFLASTRMSLADVAAHRPLIQHVL
DE 

NSQ I GRLALAKLVLVARDVIRETDAFYG DLADLDLQLRAAPPANLYARLGEWLLERSRAH PNTLFAPA 
TP 

5 THPEPLLHRIQALAQFARGEEMRVEAEAREMREALDAIjARGVDSVSQRAGPLTVMPVPAAPGAGGRAP 
CP 

PALGPEAIQARLEDVRI QARRAI ESAI KEYFHRGAVYSAKALQASDSHDCRFHVAS AAWPMVQLLES 
LP 

AFDQHTRDVAQRAALPPPPPLATS PQAI LLRDLLQRGQTLDAPEDLAAWLS VLTDAATQGLI ERKPLE 
10 EL 

ARS IHGINDQQARRSSGLAELQRFDALDAALAQQLDSDAAFVPATGPAPYVDGGGLS PEATRMAEDAL 
RQ 

ARAMEAAKMTAELAPEARSRLRERAHALEAMLNDARERAKVAHDAREKFLH KLQGVLRPL PDFVGLKA 
CP 

1 5 AVLATLRASLPAGOTDLADAVRGPPPEVTAALRADLWGLLGQYREALEHPTPDTATALAGLH 
LK 

TLFADAPETPVLVQFFSDHAPTIAKAVSNAINAGSAAVATASPAATVDAAVRAHGALADAVSALGAAA 
RD 

PASPLSFIJVALADSAAGYVKATRLALEARGAIDELTTLGSAAAD^ 
20 RA 

TTAARESLAGHEAGFGGLLHAEGTAGDHSPSGRALQELGKVIGATRRRADELEAAVADLTAKMAAQRA 
RG 

SS ERWAAGVEAALDRVENRAE FDVVELRRLQALAGTHGYNPRDFRKRAEQALAANAEAVTLALDTAFA 
FN 

25 PYTPENQRHPMLPPLAAIHRLGWSAAFHAAAETYADMFRVDAEPLARLLRIAEGLLEMAQAGIX5 
HE 

AVGRLADDMTS VPGLRRYVPFFQHGYADYVELRDRLDAI RADVHRALGGVPLDLAAAAEQ I SAARNDP 
EA 

TAELVRTGVTLPC P SEDALVACAAALERVDQS PVKNTAYAEYVAFVTRQDTAETKDAVVRAKQQRAEA 
30 TE 

RVMAGLREALAARERRAQI EAEGLANLKTMLKWAVPAWAKTLDQARSVAEI ADQVEVLLDQTEKTR 
EL 

DVPAVIWLEHAQRTFETHPLSAARGIX3PGPLARHAGRIX3ALFDTRRR 
FG 

35 RVRGGAWKSPEGFTIAMHEQLRALQOTTNWSGLRAQ 
TK 

HTALCARLRDEVVRRVPWEMNFDALGRLIAEFDAAAADLAPWAVEEFRGARELIQYRMGLYSAYA 
GQ 

TXXXXX 


gi|135576|sp|P10220|TEGU_HSV11 LARGE TEGUMENT PROTEIN

[SEQ ID NO: 121] >contig10 (start 9322 - stop 5978) translated
MSDSALQVPAPAGMTPPSAPPPNGPLQVLLGSLTNLRRPPSPSSEPAGSADEPAFLSAAKLRAATAAF 
LL
SGAAVGPAEARACWHPLLEQLCALHRAHGLPETALLAENLPGLLVHRMAVALPETPEAAFREMDVIKD
TV 

LAITGSDTTHALEAAGLRTTAALGPVRWQCAVEWIDRWRTVTQSCLAMNPR^ 
PL 

5 GQPGANLTTPAYSLLFPSPIVQEGLRFLALVSNWVTLFSAHLQRIDDAALTPLTRALFTLALVDEYLT 
TP 

DRGAWPPPLLAQFQHTVREIDPAIMIPPLEATKMVRSREEVRVSTALSRVSPRSACAPPGTLMARVR 
TD 

AAWDPDVPFLSASALAIFRPAVTGLLQLGEPPSAGAQQRLLALLQQTWALVQNSNSPSVVINTLTDA 
10 GF 

TPAHCTQYI SALEGFLVAGVPARTPPGHGLS E IQQLFGC I ALAGANVFGLARE YGHYAGYVKTFRRI Q 
GA 

SEHTHGRLCEAVGLSGGVLSQTLARIMGPAVPTEHLASLRRTLVGEFETAERRFSAGQPSLLRETALI 
WL 

1 5 DWGQTHWDLTPTTPATPLSALLPVGPPSHAPSVHIAAATK^ 

VG 

DALRATCNAAYLPRP IEFALRVLAWARDFGLG YL PTVEGHRTKLGAL I TLLEPATRAGVGPTMQMADN 

QLLRELYVIARGAVEQLRPAVQLPPPQPPEVGSSLLLISMYALAARGVLQELAERADPLVRQLEDAIV 
20 LL 

RLHmTIJ^FFECRFESDGHRLYAWADAHERLGPVmPEAMGDAVSQYCGMYHDAKRALVASLAGLRS 
W 

TETTAHLGVCDELAAQVSHEGNVLAVVRREI HGFLAI VSG I HARAS KLMSGDQVPGFCYMS QFLARWR 
RL 

25 SAGYQAAI^TGPERVAEFVQELHDTV^GLQTERALWAPFASSADQRTAAIQEVMAHATEDAPPSPA 
AD 

LVVLTNRHDLGAWGDYSLGPLGQPTWPDSVDLSPQGLAATLSMDWLLINELLQVTDGVFRASAFR 
AG 

PEAPGDLEAQDAGGSTPEPTTPGPQDTQARAPSTRPAGRETVPWPNTPVEDDEMTPQETPPVHP* 


gi|221758 UL37 [Herpes simplex virus type 1]

[SEQ ID NO: 122] >contig10 (start 9262 - stop 11211) translated

VERTGGSCRRAPGPGARC PTWRPACALGDAARRPRAQTGMTAAALYGGAKYRPGTLRNPGRVAST PRR 
35 RG 

VLYGALCPGIPFVGSGPGAVGWECVCVGGGRRDGGPDQVYRGRSVGRPNRPFKHLRMHRPSQSDTGTH 
QR 

RKPPSFVl*vT*VFSGGWFLSALLPPHLHHPPPTTRPIAI GGCT 
EL 

40 AGHAPLRRVIiRPPIARRDGPVLLGDRAPRRTASTMWLLGIDPAESSPGTRATRDDTEQATOKILRGAR 
RA 

GGLTWGAPRYHLTRQWLTDLCQPNAERAGALLIJVLRHPTDLPHLARHRAPPGRQTE 
EA 

SALGSGRAESGCARAGLVS FNFLVAACAAAYDARDAAEAVRAHITTNYGGTRAGARLDRFSECLRAMV 
HT
HWPHEVMRFFGGLVSWVTQDELASOTATC
GA 

AFLYLWTYRQCRDQEIXCVYVVKSQLPPRGLEAALERL^ 
AV 

5 LAAS SQS PRCSAS QVTNPQFVDRLYRWQPDLRGRPTARTCTYAAFAELGVMPDNSPRCLHRTERFGAV 
GV 

PW I L EGWWR PGGWRAC A * 

gi|139176|sp|P22486|VP19_HSV2G CAPSID ASSEMBLY AND DNA MATU

[SEQ ID NO: 123] >contig10 (start 11673 - stop 15215) translated

VIRRPVRPFGRTAHPASHGPAAVSVHRVRATVTLVPMANRPAASALAGARSPSERQEPREPEVAPPGG 
DH 

VFCRKVSGVMVLSSDPPGPAAYRISDSSFVQCGSNCSMI IDGDVARGHLRDLEGATSTGAFVAI SNYA 
15 AG 

GDGRTAWALGGTSGPSATTSVGTQTSGEFLHGNPRTPEPQGPQAVPPPPPPPFPWGHECCARRDARG 
GA 

EKDVGAAESWSDGPSSDSETEDSDSSDEDTGSGSETLSRSSSIWAAGATDDDDSDSDSRSDDSVQPDV 
W 

20 RRRWSDGPAPVAFPKPRRPGDSPGNPGLGAGTGPGSATDPRASADSDSAAHAAAPQADVAPVLDSQPT 
VG 

TDPGYPVPLELTPENAEAVARFLGDAVDREPALMLEYFCRCAREESKRVPPRTFGSAPRLTEDDFGLL 
NY 

ALAEMRRLCLDLP PVP PNA YTPYHLRE YATRLVNG FK PLVRRSARLYRI LGI LVHLR I RTREAS FEEW 
25 MR 

SKEVDLDFGLTERLREHEAQLMILAQALNPYDCLIHSTPNTLVERGLQSALKYEEFYLKRFGGHYMES 
VF 

QMYTRIAGFLACRATRGMRHIALGRQGSWWEMFKFFFHRLYDHQIVPSTPAMLNLGTRNYYTSSCYLV 
NP 

30 QATTNQATLRAITGNVSAILARNGGIGLCMQAFNDASPGTASIMPALKVLDSLVAAHNKQSTRPTGAC 
VY 

LEPWHSDVRAVLRMKGVIAGEEAQRCDNIFSALVMPDLFFKRLIRHLTC 
FH 

GEEFEKLYEHLEAMGFGETIPIQDIAYAIVRSAATTGSPFIMFKDAVNRHYIYDTQGAAIAGSNLCTE 
35 IV 

HPSSKRS SG VCNLGSWLARCVSRRTFDFGMLRDAVQACVLMVNIMI DSTLQPTPQCARGHDNLRSMG 
IG 

MQGLHTACLKMGLDLESAEFRDLNTHIAEVMLLAAMKTSNALCTOGARPFSHFKRSMYRAGRFHWERF 
SN 

40 ASPRYEGEWEMLRQSMMKHGLRNSQFIALMPTAASAQISDVSEGFAPLFTNLFSKVTRDGETLRPNTL 
LL 

KELERTFGGKRLLDAMDGLEAKQWSVAQALPCLDPAHPLRRFKTAFDYDQELLIDLCADRAPYVDHSQ 
SM 

TLYVTEKADGTLPASTLVRLLVHAYKRGLKTGMYYCKVRKATNSGVFAGDDNIVCTSCAL* 






WO 98/20016 



PCT/US97/20016



gi|1710385|sp|P09853|RIR1_HSV23 RIBONUCLEOSIDE-DIPHOSPHATE

[SEQ ID NO: 124] >contig10 (start 15268 - stop 16281) translated
MDPAVSPASTDPLDTHASGAGAAPIPVCPTPERYFYTSQCPDINHLRSLSILNRWLETELVFVGDEED 
5 VS 

KLSEGELGFYRFLFAFLSAADDLVTENLGGLSGLFEQKDILHY 
ND 

QARRAYVARTINH PAI RVKVDWLEARVRECDS I PEKFILMI LI EGVFFAASFAAI AYLRTNNLLRVTC 
QS 

10 NDLISRDEAVHTTASCYIYNNYUSGHAKPEAARVYRLFRE^^ 
EN 

YVRFSADRLLGLIHMQPLYSAPAPDASFPLSLMSTDKHTNFFECRSTSYAGAWNDL* 

gi|132624|sp|P03174|RIR2_HSV23 RIBONUCLEOSIDE-DIPHOSPHATE R

[SEQ ID NO: 125] >contig10 (start 17637 - stop 16564) translated

MRRRGHAFAPGDRGTRAAGPGPAAPWGAPSKPALRLAHLFCIRVLRALGYAYINSGQLEADDACANLY 
HT 

NTVAYVHTTDTDLLLMGCDIVLDISTGYIPTIHCRDLLQYFKMSYPQFLALFVRCHTDLHPNNTYASV 
20 ED 

VLRECHWTAPSRSQARRAARRERANSRSLESMPTLTAAPVGLETRISWTEILAQQIAGEDDYEEDPPL 
QP 

PDVAGG PRDGARS S SS EILTPPELVQVPNAQ RVAEHRGYVAGRRRHV I HDAPEALDWL PDPMT I AELV 
EH 

25 RYVKWISLISPKERGPOTLLKRLPIYQDLRDEDLARSIVTRHITAPDIADRFLAQLWAHAPPPAFYK 
DV 

LAKFWDE* 

gi|549322|sp|P36699|VHS_HSV2G VIRION HOST SHUTOFF PROTEIN

[SEQ ID NO: 126] >contig10 (start 18537 - stop 19949) translated

MAHLPGGAAAAPLSEDAIPSPRERTEDWPPCQIVLQGAELNGILQAFAPLRTSLLDSLLWGDRGILV 
HN 

AIFGEQWLPLDHSQFSRYRWGGPTAAFLSLVDQKRSLLSVFRANQYPDLRRVELTVTGQAPFRTLVQ 
35 RI 

WTTASDGEAVELASETLMKRELTS FAVLLPQGDPDVQLRLTKPQLTKWNAVGDETAK PTTFELGPNG 
KF 

SVFNARTCVTFAAREEGASSSTSAQVQILTSALKKAGQAAANAKTVYGENTHRTFSVVVDDCSMRAVL 
RR 

40 LQVGGGTLKFFLTADVPSVCVTATGPNAVSAVFLLKPQRVCLNW^ 
QD 

FSSEPDAGDRGAPEEEGLEGQARVPPAFPEPPGTKRRHAGAEWPADDATKRPKTGVPAAPTRAESPP 
LS 

ARYGPEAAEGGGDGGRYACYFRDLQTGDASPSPLSAFRGPQRPPYGFGLP* 

gi|136905|sp|P10226|VPAP_HSV11 DNA POLYMERASE PROCESSIVITY

[SEQ ID NO: 127] >contig10 (start 20031 - stop 21053) translated

VCPPPPTNMAWCGSGLRLRPFHPPSPSFFVLRALIRAGPGPFAASPRAPSGPGCGMCRGDSPGVAGG 
5 SG 

EHCLGGDDGDDGRPRIAOTGAIARGFAHLWLQATTLGFVGSWLSRGPYADAMSG 
AP 

PAFARPPTRVCAV^RLVGGGAAVALWSLGEAGAPPGVPGPATQCLALGAAYA^ 
PR 

10 PLFVGTLGVWGGLTIGGSARYWWIDPRAAAALTAAWAGLGTTAAGDSFSKACPRHRRFCWSAVES 
PP 

PRYAPEDAERPTDHGPLLPSTHHQRSPRVCGIX3AARPENIWPVVTFAGAIALAACAARGS 
gi|136909|sp|P10227|UL43_HSV11 MEMBRANE PROTEIN UL43

[SEQ ID NO: 128] = Contig 11 Length: 2343 Type: N Check: 6656

[SEQ ID NO: 129] >contig11 (start 2357 - stop 3) translated

APLLVDLRALDARARASSSPEGHEVDPQLLRRRGEAYLRAGGDPGPLVLREAVSALDLPFATSFLAPD 
20 GT 

PLQYALC FPAVTDKLGALLMRPEAACVR P PLPTDVLES APTVTAMYVLTVVNRLQLALSDAQAANFQL 
FG 

RFVRHRQATWGASMDAAAELYVALVATTLTREFGCRWAQLGWASGAAAPRPPPGPRGSQRHCVAFNEN 
DV 

25 LVALVAGVPEHIYNFWRLDLVRQHEYMHLTLERAFEDAAESMLFVQRLTPHPDARIRVLPTFLDGGPP 
TR 

GLLFGTRLADWRRGKLSETDPIAPWRSALEIXSTQRRDAPALGKLSPAQAl^VS^GRMCLPSAAI^ 
LW 

TCMFPDDYTEYDSFDALLAARLESGQTLGPAGGREASLPEAPHALY11PTGQHVAVLAAATH 
30 TA 

MDLVLAAVLIA3APVWALRNTTAFSRESELELCLTLFDSRPGGPDAALRDWS 
NP 

IENACLAAQLPRLSALIAERPLADGPPCLVLVDISMTPVAVLWEAPEPPGPPDVRFVGSEATEELPFV 
AT 

35 AGDVLAASAADADPFFARAILGRPFDASLLTGELFPGHPVYQRPLADEAGPSAPTAARDPRDLAG^ 
GS 

GPEDPAAPPARQADPGVIAPTLLTDATTGEPVPPRMWAWIHGLEELASDDAGGPTPNPAPALLPPPAT 
DQ 

SVPTSQYAPRPIGPAATARETRPSVPPQQNTGRVPVAPRDDPRPSPPTPSPPADAALPPPAFSGSAAA 

40 fs 

aavprvrrsrxxxxx 

gi|135576|sp|P10220|TEGU_HSV11 LARGE TEGUMENT PROTEIN

[SEQ ID NO: 130] = Contig 12 Length: 14928 Type: N Check: 1371

[SEQ ID NO: 131] >contig12 (start 1505 - stop 3) translated

5 MAAAPPAAVSEPTAARQKLLALIX^VQT 
LE 

AQAAAFLTPLSVTLELLLEYAWREGERLLGHLETFATTGDVSAFFTETMGLARPCPYHQQIRLETYGG 
DV 

RMELCFLHDVENFLKQLNYCHLITPPSGATAALERVREFMVAAVGSGLIVPPELSDPSHPCAVCFEEL 

10 cv 

TANQGATIARRLADRI CNHVTQQAQVRLDANELRRYLPHAAGLS DAARARALCVLDQALARTAAGGGA 
RA 

GPPPADSSSTOEEADALLEAHDOTQATTPGLYAISEIJIFWIASGDRARHS™^ 
QE 

15 TAAVAVELALFGRRAEHFDRAFGGHLAALDMVDAL»I IGGQATS PDDQIEALIRACYDHHLTTPLLRRL 
VS 

PEQCDEEALRRVLARLGAGGATGGAEEEEPRAAAEEGGRRRGAGTPASEDGERGPEPGAQGPESWGDI 
AT 

RAAADVXXXXX 


gi|124088|sp|P10212|PRTP_HSV11 PROCESSING AND TRANSPORT PRO

[SEQ ID NO: 132] >contig12 (start 5468 - stop 1878) translated
MDTKPKTTTTVKVPPGPMGYVYGRACPAEGLELLSLLSARSGDADVAVAPLIVGLTVESGFEANVAAV 
25 VG 

SRTTGLGGTAVSLKLMPSHYSPSVYVFHGGRHLAPSTQAPNLTRLCERARRHFGFSDYAPRPCDLKHE 
TT 

GDALCERLGLDPDRALLYLVITEGFREAVCISNTFLHIX3GMDKWIGDAEVHRIPVYPLQMFMPDFSR 
VI 

30 ADPFNCNHRS IGENFNYPLPFFNRPLARLLFEAWGPAAVALRARNVDAVARAAAHLAFDENHEGAAL 
PA 

DITFTAFEASQGKPQRGARDAGNKGPAGGFEQRLASVMAGDAALALES I VSMAVFDEPP PD I TTWPLL 
EG 

QETPAARAGAVC^YLARAAGLVGAMWSTNSALHLTEVDDAGPADPKDHSKPSFYRFFLVPGTHVA^ 

35 pq 

LDREGHWPGYEGRPTAPLVGGTQEFAGEHI^LCGFSPALLAKMLFYLERCDGGVIVGRQEMDVFRY 
VA 

DSGQTDVPCNLCTFETRHAC AHTTLMRLRARH PKFAS AARGAI GVFGTMN S AYSDCDVLGNYAAFS AL 
KR 

40 ADGSENTRTIMQETYRAATERVMAELEALQYVDQAVPTALGRLETIIGTREALHTVVNNIKQLVDREV 
EQ 

LMRNLIEGRNFKFRDGLAEANHAMSLSLDPYTCGPCPLLQLLAR^ 
EG 

RNFRNQFQPVIJUmVMDLFNNGFLSAKTLTVALSEGAAICAPSLTAGQTAPAESSFEGDVARVTLGFP 
KE
LRVKSRVLFAGASANASEAAKARVASLQSAYQKPDKRVDILLGPLGFLLKQFHAVIFPNGKPPGSNQP
NP 

QWFWALQRNQLPARLLSREDIETIAFIKRFSLDYGAINFINIAPNWSELAMYYMANQILRYCDHST 
YF 

5 INTLTAVIAGSRRPPGVQAAAAWAPQGGAGLEAGARALMDSLDAHPGAWTSMFASCNLLRPVMAARPM 
W 

LGLSIS KYYGMAGNDRVFQ AGNWAS LLGGKNAC PLL I FDRTRKFVLAC PRAGFVC AAS S LGGGAH EHS 
LC 

EQLRGI IAEGGAAVASSVFVATVKSLGPRTQQLQI EDWLALLEDEYLSEEMMEFTTRAIjERGHGEWST 
10 DA 

ALEVAHEAEALVSQLGAAGEVFNFGDFGDEDDHAASFGGLAAAAGAAGVARKRAFHGDDPFGEGPPEK 
KD 

LTLDML* 

gi|544182|sp|P36384|DNBI_HSV2 MAJOR DNA-BINDING PROTEIN

[SEQ ID NO: 133] >contig12 (start 6286 - stop 10008) translated
MFCAAGGPTSPGGKSAARAASGFFAPHNPRGATQTAPPPCRRQNFYNPHLAQTGTQPKAPGPAQRHTY 
YS 

20 ECDEFRFIAPRSLDEDAPAEQRTGWDGRLRRAPKVYCGGDERDVLRVGPEGFWPRRLRLWGGADHAP 
EG 

FDPTVTVFHVYDILEHVEHAYSMRAAQLHERFMDAITPAGTVITLLGLTPEGHRVAVHVYGTRQYFYM 
NK 

AEVDRHLQCRAPRDLCERLAAALRESPGASFRGISADHFEAEWERADVYYYETRPTLYYRVF^ 
25 AL 

AYLCDNFC PAI RKYEGGVDATTRF ILDNPGFVTFGWYRLKPGRGNAPAQ PRPPTAFGTS S DVEFNCTA 
DN 

LAVEGAMCDLPAYKLMCFDIECKAGGEDELAFPVAERPEDLVIQISCLLYDLSTTALEHILLFSLGSC 
DL 

PESHLSDLASRGLPAPVVLEFDSEFEMLLAFMTFVKQYGPEFVTGYNIINFDWPF\^TKLTEIYKVPL
DG 

YGRMNGRGVFRVWDIGQSHFQKRS KI KVNGMVNI DMYG I ITDKVKLSS YKLNAVAEAVLKDKKKDLSY 
RD 

I PAYYASG PAQRGVIGEYCVQDSLLVGQLFFKFLPHLELSAVARLAG INITRT I YDGQQ IRVFTCLLR 
35 LA 

GQKGFILPDTQGRFRGLDKEAPKRPAVPRGEGERPGDGNGDEDKDDDEDGDEDGDEREEVARETGGRH 
VG 

YQGARVLDPTSGFHVDPVVVFDFASLYPSIIQAHNLCFSTLSLRPEAVAHLEADRDYLEIEVGGRRLF 
FV 

40 KAHVRESLLSILLRDWLAMRKQIRSRIPQSTPEEAVLLDKQQAAIKWCNSVYGFTGVQHGLLPCLHV 
AA 

TVTTIGREMLLATRAYVHARWAEFDQLLADFPEAAGMRAPGPYSMRI I YGDTDSIFVLCRGLTAAGLV 
AM 

GDKMASHISRALFLPPIKLECEKTFTKLLLIAKKKYIGVICGGKMLIKGVDLVRKNNCAFINRTSR^ 
VD
LLFYDDWSGAAAAIJ^PAEEWLARPLPEGLQAFGAVLVDAHRRITDPERDIQDFVLT^^
YT 

NKRLAHLTVYYKLMARRAQVPS IKDRI PWIVAQTREVEETVARLAALRELDAAAPGDEPAPPAALPS 
PA 

5 KRPRETPSHADPPGGASKPRKLLVSELAEDPGYAIARGVPLNTDYYFSHLLGAACVTFKA^ 
TE 

SLLKRFI PETWH P PDDVAARLRAAGFGPAGAGATAEETRRMLHRAFDTLA* 
gi|118882|sp|P07918|DPOL_HSV21 DNA POLYMERASE

[SEQ ID NO: 134] >contig12 (start 10870 - stop 9953) translated

MYDIAPRRSGSRPGPGRDKTRRRSRFSAAGNPGVERRASRKSLPSHARRLELCLHERRRYRGFFAALA 
QT 

PS EEIAI VRSLSVPLVKTT PVSLPFSLDQTVADNCLTLSGMGYYLG I GGCC PAC SAGDGRLATVSREA 
15 LI 

IAFVQQINTIFEHRTFLASLVVIJUDRHSTPLQDLLADTIX5QPELFFVH 
YG 

GHMLWIFPGTSAHLHYRLIDRMLTACPGYRFAAHVWQSTFVLVTORNAEKPADAEIPTVSAADIYCK 
MR 

20 DI SFDGGLMLEYQRLYATFDEF P PP * 

gi|136875|sp|P10215|UL31_HSV11 PROTEIN UL31

[SEQ ID NO: 135] >contig12 (start 12674 - stop 10863) translated

25 VRPARPAMATSAPGVPSSAAWEESPGSSWKEGAFERPWAFDPDLLALNEALCAELLAACHWGVPP 
AS 

ALDEDVESDVAPAPPRPRGAAREASGGRGPGSARGPPADPTAEGLLDTGPFAAASVDTFALDRPCLVC 
RT 

i elykqayrls pqwvadyaflc akclgaphcaas i fvaafefvyvmdhhflrtkkatlvgs farf alt 
30 in 

DIHRHFFLHCCFRTDGGVPGRHAQKQPRPTPSPGAAKVQYSNYSFLAQSATRALIGTLASGGDDGAGA 
GG 

GSGTQPSLTTALMNV^DCARLLDCTEGKRGGGDSCCTRAAARNGEFEAAAGALAQGGEPETWAYADLI 
LL 

35 LIAGTPAWESGPRLRAAADARRAAVSESWEAHRGARMRDAAPRFAQFAEPKAQPDL 
HG 

RGRGRTGGECLLCNLLLVRAYWLAMRRLRASVTOYSENNTSLFDCIVPVTOQLEADPEAQPGDGGRW 
SL 

IJ^GPEAIFKHMFCDPMCAITEMEVDPWVLFGHPRADHRDELQLHKAKLACGNEFEGRVCIALRALI 
40 YT 

FKTYQVFVPKPTALATFVREAGALLRRHS I SLLS LEHTLCTYV* 

gi|136879|sp|P10216|UL32_HSV11 PROBABLE MAJOR ENVELOPE GLYC

[SEQ ID NO: 136] >contig12 (start 12652 - stop 13044) translated

MAGRAGRTRPRTLRDAIPDCALRSQTLESLDARYVSRDGAGDAAVWFEDMTPAELEVIFPTTDAKLNY
LS 

RTQRLASLLTYAGPIKAPDGPAAPHTQDTACVHGELDATERERFAAVINRFLDLHQILRG* 

gi|136883|sp|P10217|UL33_HSV11 PROTEIN UL33

[SEQ ID NO: 137] >contig12 (start 13134 - stop 13964) translated
MAGMGKPYGGRPGDAFEGLVQRIRLIVPTTLRGGGGESGPYSPSNPPSRCAFQFHGQDGSDEAFPIEY 
VL 

RLMNDWADVPCNPYLRVQNTGVSVLFQGFFNRPHGAPGGAITAEQTNVILHSTETTGLSLGDLDDVKG
RL 

GLDARPMMASMWI SCFVRMPRVQLAFRFMGPEDAVRTRRI LCRAAEQALARRRRSRRSQDDYGAVAVA 
AA 

HHSSGAPGPGVAASGPPAPPGRGPARPWHQAVQLFRAPRPGPPALLLLVAGLFLGAAIWWAVGARL* 


gi|136888|sp|P10218|UL34_HSV11 VIRION PROTEIN UL34

[SEQ ID NO: 138] >contig12 (start 14076 - stop 14414) translated

MAAPQFHRPSTITADNVRAIX3MRGLVLATNNAQFIMDNSYPHPHGTQGAVREFLRGQAAALTDLGVTH 
AN

NTFAPQPMFAGDAAAEWLRPSFGLKRTYSPFWRDPKTPSTP*

gi|139196|sp|P10219|VP26_HSV11 CAPSID PROTEIN VP26

[SEQ ID NO: 139] = Contig ID 13 Length: 838 Type: N Check: 7960

[SEQ ID NO: 140] >contig13 (start 852 - stop 1) translated

RRLYADRLTKRSIASLGRCVREQRGELEKMLRVSVHGEVLPATFAAVANGFAARARFCALTAGAGTVI
DN 

RAAPGVFDAHRFMRASLLRHQVDPALLPSITHRFFELVNGPLFDHSTHSFAQPPNTALYYSVENVGLL 
PH 

LKEEIJ^FIMGAGGSGADWAVSEFQKFYCFDGVSGITPTQRAAWRYIRELIIATTLFASVYRCGELEL 
15 RR 

PDCSRPTSEGLYRYPPGVYLTYNSDCPLVAIVESGPIXSCIGPRSVVVTO^ 
GXXXXX 

gi|124089|sp|P12835|PRTP_HSV1A PROCESSING AND TRANSPORT PRO

[SEQ ID NO: 141] = Contig ID 14 Length: 2647 Type: N Check: 2951

[SEQ ID NO: 142] >contig14 (start 2661 - stop 97) translated

PPVPSPATTKARKRKTKKPPKRPEATPPPDANATVAAGHATLRAHLREIKVENADAQFYVCPPPTGAT 

W 

QFEQPRRCPTRPEGQNYTEGIAVVFKENIAPYKFKATMYYKDVTVSQVWFG 
5 FE 

EVIDKINAKGVCRSTAKYVRNNMETTAFHRDDHETDMELKPAKVATRTSRGWHTTDLKYNPSRVEAFH 
RY 

GTTVNCIVEEVDARSOTPYDEFVLATGDFVYMSPFYGYREGSOT 
AR 

10 ATS PTTRNLLTT PKFTVAWDWVPKRPAVCTMTKWQEVDEMLRAEYGGS FRFS SDAI STTFTTNLTQYS 
LS 

RVDLGDC IGRDAREAI DRMFARKYNATHIKVGQPQYYLATGGFLI AYQPLLSNTLAELYVREYMREQD 
RK 

PRNAT PAPLREAPSANASVERIKTTS S I EFARLQFTYNH IQRHVNDMIjGR I AVAWC ELQNHELTLWNE 
15 AR 

KLNPNAIASATVGRRVSARMLGDVMAVSTCVPVAPDNVIVQNSMRVSSRPGTCYSRPLVSFRYEDQGP 
LI 

EGQLGENNELRLTRDALEPCWGHRRYFIFGGGYVYFEEYAYSHQLSRADVTTVSTFIDLNITMLEDH 
EF 

20 VPLEVYTRHEIKDSGLLDYTEVQRRNQLHDLRFADIDTVIRADANAAMFAGLCAFFEGMGDLGRAVGK 
W 

MGWGGWSAVSGVS SFMSNPFGALAVGLLVLAGLVAAFFAFRYVLQLQRNPMKAL Y PLTTKELKTS D 
PG 

GVGGEGEEXSAEGGGFDEAKIJVEAREMIRYMALVSAMERTEHKARKKGTSALLSSKVTNMVLRKR^ 

25 ys 

PLHNEDEAGDEDEL * 

gi|138198|sp|P06763|VGLB_HSV23 GLYCOPROTEIN B PRECURSOR

[SEQ ID NO: 143] = Contig ID 15 Length: 20389 Type: N Check: 2794

[SEQ ID NO: 144] >contig15 (start 788 - stop 3) translated

35 MNAHFANEVQYT>LTRDPSSPASLIHVIISSECLAAAGVPLSALTO 
DC 

TPWRSAFAAYWADAVGAILAPVIPAHPDLLPRVPSAGGLFVSLPVACDAQGVYDPYTVAALRLAWGP 
WA 

TCARVLLFSYDELVPPNTRYAADGARLMRLCRHFCRWARLGAAAPAAATEAAAHLSLGMGESGTPTP 
40 QA 

SSVSGGAGPAWGTPDPPISPEEQLTAPGGDTATAEDVSITQENEEIXXXXX 
gi|136835|sp|P10201|UL17_HSV11 PROTEIN UL17

[SEQ ID NO: 145] >contig15 (start 818 - stop 2089) translated

VPEGAWVGGACARPRGPRAHVRLYAVCFVCPQGIRGQDFNLLFVDEANFIRPDAVQTIMGFLNQANCK
II 

FVSSTNTGKASTSFLYNLRGAADELLNVVTYICDDHMPRVVTHTNATACS^^ 
TA 

5 DLFLPDSFMQEIIGGQARETGDDRPVLTKSAGERFLLYRPSTTTNSGLMAPELYVYVDPAFTANTRAS 
GT 

GIAWGRYRDDFIIFALEHFFLRAIjTGSAPADIARCVVH 
VA 

IATHVHTEMHRILASAGANGPGPELLFYHCEPPGGAVLYPFFLLNKQKTPAFEYFIKKFNSGGVMASQ 
10 EL 

VSVTVRLQTDPVEYLSEQLNNLIETVSPNTDVRMYSGKRNGAADDLMVAVIMAIYLAAPTGIPPAFFP 
IT 

RTS* 

gi|139646|sp|P04295|VTER_HSV11 PROBABLE DNA PACKAGING PROTE

[SEQ ID NO: 146] >contig15 (start 3520 - stop 2429) translated

VLLSPAPPPLPHGRCPPSLFHHRPGCVALSGPPAPPRSGVSRPGAMITDCFEADIAIPSGISRPDAAA 
LQ 

20 RCEGRWFLPTI RRQLALADVAHES FVSGGVS PDTLGLLLAYRRRFPAVI TRVL PTRI VAC PVDLGLT 
HA 

GTVNLRNTS PVDLCNGDPVSLVP PVFEGQATDVRLESLDLTLRFPVPLPTPLARE I VARLVARG IRDL 
NP 

DPRTPGELPDLNVLYYNGARLSLVADVQQLASVNTELRSLVLNMVYS ITEGTTLILTLI P 
25 DG 

YVNALLQMQSVTREAAQLIHPEAPMLMQDGERRLPLYEALVAWLAHAGQIX3DIIALAPAVR 
AV 

VQSGDMAPVI RYP * 

gi|139191|sp|P10202|VP23_HSV11 CAPSID PROTEIN VP23

[SEQ ID NO: 147] >contig15 (start 7954 - stop 3764) translated

VWEGLGLPELGLMEPANPPRNPMAAPARDPPGYRYAAAMVPTGS ILSTI EVASHRRLFDFFARVRSDE 
NS 

35 LYDVEFDALLGS YCNTLSLVRFLELGLSVACVCTKFPELAYMNEGRVQFEVHQPL I ARDG PHPVEQPV 
HN 

YMTKVI DRRALNAAFSLATEAI ALLTGEALDGTGI SLHRQLRAIQQLARNVQAVLGAFERGTADQMLH 
VL 

lekapplalllpmqryldngriatrvaratlvaelkrsfcdtsfflgkaghrreaieawlvdlttatq 
40 ps 

vavprlthaotrgrpvdgvlvttaaikqrllqsflkvedte 

RS 

LDDVGRHLLEMQEEQLEANRETLDELESAPQTTRVRADLVAIGDRLVFLEALEKRIYAATNVPYPLVG . 
AM
DLTFVLP3JGLFNPAMERFAAHAGDLVPAPGHPEPRAFPPRQLFFWGKDHQVLRLSMENAVGTVCHPSL
MN 

IDAAVGGVNHDPVEAANPYGAYVAAPAGPGADMQQR 
AN 

5 LALELHPAFDFFAGVADVELPGGEVPPAG PGAIQATWRWNGNL PLALC PVAFRDARGLELGVGRHAM 
AP 

ATI AAVRGAFEDRSYPAVFYLLQAAI HGSEHVFCALARLVTQC I TSYWNNTRCAAFVNDYSLVS YI VT 
YL 

GGDLPEECMAVYRDLVAHVEALAQLVDDFTLPGPEI^GQAQ 
10 AL 

drhrdcridagghepvyaaacnvatadfnrndgrllhotqara^ 

VP 

AFSRGRCCTAGVRFDRVYATLQNMWPEIAPGEECPSDPVTDPAHPLHPANLVANTVNAMFHNGRVVV 
DG 

1 5 PAMLTLQVLAHNMAERTTALLC SAAPDAGANTASTANMRI FDGALHAGVLLMAPQHLDHTIQNGEYFY 
VL 

PVHALFAGADHVANAPNFP PALRDIARHVPLVPPALGANYFSS I RQPWQHARESAAGENALTYALMA 
GY 

FKMSPVALYHQLKTGLHPGFGFTVVRQDRFVTENVLFSERASEAYFLGQLQVARHETGGGVSFTLTQP 
20 RG 

NVDLGVGYTAVAATATVRN PVTDMGNLPQNFYLGRGAP PLLDNAAAVYLRNAWAGNRLG PAQPL PVF 
GC 

AQVPRRAGMDHGQDAVCEFIATPVATDINYFRRPCNPRGRAAGGVYAGDKEGDVIALMYDHGQSDPAR 
PF 

25 AATANPWASQRFSYGDLLYNGAYHLNGASPVLSPCFKFFTAADITAKHRCLERLIVETGSAVSTATAA 
SD 

VQFKRPPGCRELVEDPCGLFQEAYPITCASDPALLRSARDGEAHARETHFTQYLIYDASPLKGLSL* 
gi|137571|sp|P06491|VCAP_HSV11 MAJOR CAPSID PROTEIN (MCP)

[SEQ ID NO: 148] >contig15 (start 8869 - stop 8201) translated

MTMRDDVPLLDRELVYEAACGGEDGELPLDEQFSLSSYGTSDFFVSSAYSRLPPHTQPVFSKRWMFA 
WS 

FLVLKPLELVAAGMYYGWTGRAVAPAC 1 1 AAVLAYYVTWLARALLLYVNI KRDRLPLS PPVFWGLCVI 
35 MG 

GAALCALVAAAHETFSPTCLFHWITASQLLPRTDPLRARSIX3IACAAGAAMWAAADCFAAFTO 
RF 

WTRAI LKAPVAF * 

gi|136841|sp|P10204|UL20_HSV11 MEMBRANE PROTEIN UL20

[SEQ ID NO: 149] >contig15 (start 9205 - stop 11118) translated

VGRQGERWVGGGNEENTQRATSGMRPELSLKGRPCVTEAWCPSTDAAIHSGGSSSVRPQPYARAARA 
RA
THGSRSRHRQPLLPPPSSHHPTIPPPPSPPRGSPAMELSYATTLHHRDVVFYVTADRNRAYFVCGGSV
YS 

VGRPRDSQPGEIAKFGLVVRGTGPKDRMVAlSrm^SELRQRGLRDVRPVGEDEVFLDSVCLLNPNVSSE 
RD 

5 VINTNDVEVLDECLAEYCTSLRTSPGVLWGTVRVRARD^ 
QA 

HLPRLPS S LE PLVSGLFDG I PAPRQPLDARDRRTDWITGTRAPRPMAGTGAGGAGAKRATVSEFVQV 
KH 

IDRWSPSVSSAPPPSAPDASLPPPGLQEAAPPGPPLRELWWVFYAGDRALEEPHAESGLTREEVRAV 
10 HG 

FREQAWKLFGSVGAPRAFLGAAIJUjSPTQKLAVYYYLIHRERRMSPFPALVRI>V 
DE 

PTLADAMNGLFRDALAAGTVAEQLLMFD^^ 
MY 

1 5 LGAFU5VLYAGHGRIJUUVrHTARLTG^ 
AR 

AQHGQSV* 

gi|136845|sp|P10205|UL21_HSV11 PROTEIN UL21

[SEQ ID NO: 150] >contig15 (start 14107 - stop 11339) translated
VSISAGVRGQGWHRISTPPKNGAGRSVLVFGLVLPLCFYPHPTPSFGPRLRQQRASDSLRGAEPLWAV 
GT 

OTPPSADWQPGRTTMGPGLWVVMGVLVGVAGGHDTYWTEQIDPWFLHGLGLARTYWRDTNTGRLWLPN 
25 TP 

DASDPQRGRIAPPGELNLTTASVPMLRWYAERFCFVLVTTAEFPRDPGQLLYIPKTYLLGRPRNASLP 
EL 

PEAGPTSRPPAEVTQLKGLSHNPGASALLRSRAWVTFAAAPDREGLTFPRGDDGATERHPDGRRNAPP 

PG

30 PPAGTPRHPTTNLSI AHLHNASVTWLAARGLLRTPGRYVYLS PSASTWPVGVWTTGGLAFGCDAALVR 
AR 

YGKGFMGLVI SMRDSPPAEI IVVPADKTLARVGNPTDENAPAVLPGPPAGPRYRVFVLGAPTPADNGS 
AL 

DALRRVAGYPEESTOTAQYMSRAYAEFIjGEDPGSGTDARPSLFWRLAGLLAS sg fafvnaahahdai r 

ls

dllgflahsrvlaglaargaagcaadsvflnvsvldpaari^ 
QL 

afvldspaaygavapsaarlidalyaeflggraltapmvrralfyatavlrapflagapsaeqrerar 

RG 

LLITTALCTSDVAAATHADLRAALARTDHQKNLFV^

IP 

ADVMAQQTRGVASVLTRWAHYNALIRAFVPEATHQCSGPSHNAEPRILVPITHNASYVVTHTPLPRGI 
GY 

KLTGVDVRRPLFITYLTATCEGHAREIEPKRLVRTENRRDLGLVGAVFLRY 
QQ

QLAQGPVAGTPNWSSDVPSVALLLFPNGTVIHLLAFDTLPIATIAPGFIiAASALGVVMITAALAGIL
RV 

VRTCVPFLWRRE* 

gi|138315|sp|P06477|VGLH_HSV11 GLYCOPROTEIN H PRECURSOR

[SEQ ID NO:151] >contig15 (start 15322 - stop 14192) translated
MASHAGQQHAPAFGQAARASGPTDGRAASRPSHRQGASEARGDPELPTLLRVYI DGPHGVGKTTTS AQ 
LM 

EAIXSPRDNIVYVPEPMTYWQVLGASETLTNIYNT^
VL 

APHIGGEAVGPQAPPPALTLVFDRHPIASLLCYPAARYLMGSMTPQAVLAFVALMPPTAPGTNLVLGV 
LP 

EAEHADRLARRQRPGERLDLAMLSAIRRWDLLANTVRYLQRGGRWREDWGRLTGVAAATPRPDPEIX3 
AG

SLPRIEDTLFALFRVPELLAPNGDLYHIFAWVLDVLADRLLPMHLFVLDYDQSPVGCRDALLRLTAGM 
IP 

TRVTTAGS I AEI RDLARTFAREVGGV* 

gi|125438|sp|P04407|KITH_HSV23 THYMIDINE KINASE

[SEQ ID NO:152] >contig15 (start 15005 - stop 16069) translated

VLRWDVRQGLGGPQHLPVSHRLGDVDDIVARPQGLHQLRGGGGLPHPVGSVYINPQQRGQLRIPAGF 
GG 

PLAMARTGRRAAVGRPARTSSLTERRRVLLAGVRSHTRFYKAFARETO
GR 

SLFEATRVTLICEVDLGPRRPDCICVFEFANDKTLGGVCVILELKTCKSISSGDTASKREQRTTGMKQ 
LR 

HSLKLLQSLAPPGDKVVYLCPILVFVAQRTLRVSRVTRLVPQKISGNITAAVRMLQSLSTYAVPPEPQ 
TR

RSRRRVAATARPQRPPS PTRDPEGTAGHPAPPESDPPS PGWGVAAEGGGVLQKI AALFCVPVAAKSR 
PR 

TKTE* 

gi|136854|sp|P10208|UL24_HSV11 PROTEIN UL24

[SEQ ID NO:153] >contig15 (start 16350 - stop 18107) translated
MDPYYPFDALDVWEHRRFIVADSRSFITPEFPRDFWMLPVFNIPRETAAERAAVLQAQRTAAAAALEN 
AA 

LQAAELPVDIERRI RPI EQQVHH I ADALEALETAAAAAEEADAARDAEARGEGAADGAAPS PTAGPAA
AE 

MEVQIVRNDPPLRYDTNLPVDLLHMVYAGRGAAGSSGVVFGTWYRTIQER 
MS 

KTFMTALVLSLQSCGRLYVGQRHYS AFECAVLCLYLLYRTTHES S PDRDRAPVAFGDLLARLPRYLAR 
LA

AVIGDESGRPQYRYRDDKLPKAQFAAAGGRYEHGAIATHWIATLVRHGVLPAAPGDVPRDTSTRVNP
DD 

VAHRDDVNHRAAAAPLARGH^ 
GA 

VPAEAIARGASGLDSGAIKSGDNNLEALCVNYVLPLYQADPTVELTQLFPGLAALCLDAQAGRPIAST
RR 

WDMS SGARQAALVRLTALELINRTRTOTTPVGEI INAHDALGIQYEQGLGLIAQQARIGIASNAKRF 
AT 

FNVGSDYDLLYFLCLGFIPQYLSVA* 

gi|136863|sp|P10209|UL25_HSV11 VIRION PROTEIN UL25

[SEQ ID NO:154] >contig15 (start 18328 - stop 20256) translated
VRVPMASAFMRERLEAPLPDRAVPIYVAGFLALYDSGDPGELA^ 
CE

VGRVLA\A/NDPRGPFFVGLIACVQLERVLETAASAAIFERRGPALSREERLLYLITNYLPSVSLSTKR 
RG 

DEVPPDRTLFAHVALCAIGRRLGTIVTYDTSLDAAIAPFRHLDPATREGVRREAAEAELALAGRTWAP 
GV 

EALTHTLLSTAVNNMMLRDRWSLVAERRRQAG I AGHTYLQAS EKFK IWGAE S APAPERGYKTGAPGAM
DT 

SPAASVPAPQVAVRARQVASSSSSSSSFPAPADMNPVSASGAPAPPPPGDGSYLWIPAFHYNQLVTGQ 
SA 

phhppltacglpaagtvayghpgagpsph yp pp pahpypgmlfag ps pleaqi aalvgai aadrqagg 
lp

aaagdhgirgsakrrrheveqpeydcgrdepdrdfpyypgearpeprpvdsrraarqasgphetital 

VG 

AVTSLQQELAHMRARTHAPYGPYPPVGPYHHPHADTETPAQPPRYPAEAVYLPPPHIAPPGPPLSGAV 
PP 

PSYPPVAVTPGPAPPLHQPSPAHAHPPPPPPGPTPPPAASLPQPEAPGAEAGALVNASSAAHVNVDTA
RA 

ADLFVSQMMGSR* 

>gi|529230 UL26 [Herpes simplex virus type 1]

[SEQ ID NO:155] = Contig 16 Length: 11707 Type: N Check: 6054

[SEQ ID NO:156] >contig16 (start 190 - stop 2) translated

MEA PG IVWVEESVS AITLYAVWL PPRTRDCLHALLYLVCRDAAGEARARFAEVSVGSSXXXXX 

gi|136802|sp|P10192|HEPA_HSV11 DNA HELICASE/PRIMASE COMPLEX

[SEQ ID NO:157] >contig16 (start 2855 - stop 240) translated

MAETMNVATCTHQTHHAARAPGATSAPGAASGDPLGARRP IGDDEC EQ YTS SVSLARMLYGGDLAEWV 
PR 

VHPKTTI ERQQHGPOTFPDASAPTARCOTWRAPMGSGKTTALIRWLGEAIHS PDTSVLWSCRRSFT 
QT

tATRFAESGLPDFVTYFS STNYIMNDRPFHRLIVQVE S LHRVGPNLLNNYDVLVLDEVMSTLGQLYS P 
TM 

QQLGRVDALMLRLLRTCPRI I AMDATANAQLVDFLCSLRGEKNVHWI GEYAMPGFSARRCLFL PRLG 
PE 

VLQAALRPPGPAGGAPPPDAPPDATFFGELEARLAGGDNVCIFSSTVSFAEWARFCRQFTDRVLLLH
SL 

TPPGDVTTWGRYRWIYTTVVWGLSFDPPHFDSMFAYVKPMNYGPDWSVYQSLGR 
YM 

IX3SGARSEPWTPMLLNHWSASGQWPAQFSQVTNLLCRRFKGRCDASHADAAQARGSRIYSKFRYKH 
YF

ERCTLACLADSLN I LHMLLTLNCMHVRFWGHDAALTPRNFCLFLRG I HFDALRAQRDLRELRCQDPDT 
SL 

S AQAAETEEVGLFVEK YLRPDVAPAEWALMRGLNSLVGRTRFI YLVLL EACLRVPMAAH S SAI FRRL 
YD 

HYATGVIPTINAAGELELVALHPTLWAPVWELFRLCSTMAACLQWDSMAGGSGRTFSPEDVLELLNP
HY 

DRYMQLVFELGHCNVTDGPLLSEDAVKRVADALSGCPPRGSVSETEHALSLFKIIWGELFGVQLAKST 
QT 

FPGAGRVKNLTKRAIVELLDAHRIDHSACRTHRQLYALLMAHKREFAGARFKLRAPAWGRCLRTHASG 
AQ

PNTDI ILEAALSELPTEAWPMMQGAVNFSTL* 

gi|136806|sp|P10193|OBP_HSV11 ORIGIN OF REPLICATION BINDING

[SEQ ID NO:158] >contig16 (start 2707 - stop 4137) translated

VYCSHS S S PMGRRAPRGS PEAAPGADVAPGARAAWWWCVQVATFI VS AIC WGLLVLASVFRDRFPC 
LY 

APATSYAEANATVEVRGGVAVPLRLDTQSLIATYAITSTLLIJUVAVYAAVGA 
AA 

RMAMPHATLIAGNVCAWLLQITVLLLAHRISQLAHLIYVLHFACLVYLAAHFCTO
LI 

DPAPTHHRI VGPVRAVMTNALLLGTLLCTAAAAVSLNT I AALNFNFSAPSMLI CLTTLFALLWSLLL 
W 

EGVLCHYVRVLVGPHI^AIAATGIVGLACEHYHTGGYYVVEQQWPGAQTGVRVALAL 

rc

TRAYLYHRRHHTKFFVRMRDTRHRAHSALRRVRSSMRGSRRGGPPGDP 
DS 

DGDPIYDEVAPDHEAELYARVQRPGPVPDAEPIYDTVEGYAPRSAGEPVYSTVRRW* 
gi|136810|sp|P04288|VGLM_HSV11 GLYCOPROTEIN

[SEQ ID NO:159] >contig16 (start 4621 - stop 4331) translated

MGIAFSGARPCCCRHNVIITDGGEWSLTAHEFDWDIESEEEGNFYVPPDMR\A^ 
PP 

SRHTRRRDPDVARPPATLTPPLSDSE*

gi|136816|sp|P13294|UL11_HSV2 HYPOTHETICAL UL11 PROTEIN

[SEQ ID NO:160] >contig16 (start 6399 - stop 4537) translated

MAAAATPGAKRPADPARDPDSPPKRPRPNSLDLATVFGPRPAPPRPTSPGAPGSHWPQSPPRGQPDGG
AP 

GEKARPASPALSEASSGPPTPDIPLSPGGAHAIDPDCSPGPPDPDPMWSASAIPNALPPHILAETFER 
HL 

RGLLRGVRS PLAIG PLWARLDYLCSLVVSLEAAGMVDRGLGRHLWRLTRRAPPSAAEAVAPRPLMGFY 
EA

ATQNQADCQLWALLRRGLTTASTLRWGAQGPCFSSQWLTHNASLRLDAQSSAVMFGRVNEPTARNLLF 
RY 

CVGRADAGWDDADAGRFVFHQPGDLAEENVHACGVLMDGHTGMVGASLDILVCPRDPHGYLAPAPQT 
PL 

AFYEVKCRAKYAFDPADPGAPAASAYEDLMARRSPEAFRAFIRSIPNPGVRYFAPGRVPGPEEALVTQ
DR 

DWLDSRAAGEKRRCSAPDRALVELNSGWSEVLLFGVPDLERRTISPVAWSSGELVRREPIFANPRHP 
NF 

KQILVQGYVLDSHFPDCPLQPHLVTFLGRHRAGAEEGVTFRLEDGRGAPAGRGGAPGPAKASILPDQA 
VP

IALIITPVRVEPGIYRDIRRNSRLAFDDTLAKLWASRSPGRGPAAADTTSSSPTAGRSSR* 

gi|119694|sp|P06489|EXON_HSV2 ALKALINE EXONUCLEASE

[SEQ ID NO:161] >contig16 (start 8023 - stop 6440) translated

VGGRRPGGRMDESGRQRPASHVAADISPQGAHRRSFKAWLASYIHSLSRRASGRPSGPSPRDGAVSGA 
RP 

GSRRRSSFRERLRAGLSRWRVSRSSRRRSSPEAPGPAAKLRRPPLRRSETAMTSPPSPPSHILSLARI
HK 

LC I PVFAVNPALRYTTLE I PGARSFGGSGGYGEVQLI REHKLAVKT I REKEWFAVELVATLLVGECAL
RG

GRTHDIRGFITPLGFSLQQRQIVFPAYDMDLGKYIGQLASLRATTPSVATALHHCFTDLARAWFLNT 
RC 

GI SHLDIKCANVLVMLRSDAVSLRRAVLADFSLVTLNSNSTISRGQFCLQEPDLES PRGFGMPAALTT 
AN

FHTLVGHGYNQPPELLVKYLNNERAEFNNRPLKHDVGLAVDLYALGQTLLELLVSVYVAPSLGVPVTR 
VP 

GYQYFNNQLS PDFAVALLAYRCVLHPALFVNSAETNTHGLAYDVPEGIRRHLRNPKIRRAFTEQC INY 
QR 

THKAVLSSVSLPPELRPLLVLVSRLCHANPAARHSLS *

gi|125628|sp|P04290|KR2_HSV11 PROBABLE SERINE/THREONINE-PRO

[SEQ ID NO:162] >contig16 (start 8409 - stop 7750) translated
MSRDASHAALRRRLAETOLRAEVYRDQTLQLHREGV
MM 

RQRATCVKIRVEEQAARRDFLTAHRRYLDPALSERLDAAI)DRLADQEEQLEEAAANASL 
WM 

SPGDSDLLVMWQLTSAPKVHTDAPSRPGSRPTYTPSAAGRPDAQAAPPPETAPSPEPAPGPAADPASG 
SG

FARDCPDGE* 

gi|136823|sp|P04291|UL14_HSV11 HYPOTHETICAL UL14 PROTEIN

[SEQ ID NO:163] >contig16 (start 8295 - stop 9788) translated

VYSRPPGVAAGSGPCTPRPGGASRPNVGAGPRGWRLGSSRRPRARPTSDSFAPTPLTSAAPASPAMFG 
QQ 

LASDVQQYLERLEKQRQQKVGVDEASAGLTLGGDALRVPFLDFATATPKRHQTWPGVGTLHDCCEHS 
PL 

FSAVARRLLFNSLVPAQLRGRDFGGDHTAKLEFLAPELVRAVARLRFRECAPEDAVPQRNAYYSVLNT
FQ 

ALHRSEAFRQLVHFVRDFAQLLKTSFRASSLAETTGPPKKRAKVDVATHGQTYGTLELFQKMILMHAT 
YF 

IAAVLLGDHAEQWTFLRLVFEIPLFSDTAVRHFRQRATVFLVPRRHGKTWFLVPLIALSLASFRGIK 
IG

YTAHIRKATEPVFDEIDACLRGWFGSSRVDHVKGETISFSFPDGSRSTIVFASSHNTNVSTPSSRGAC 
FP 

GAALPEIDRQTNTARRECGTTRPQPPPPWRGEALLFICNRTMRLWPRPARPRGSSLQTGGWYTMTERR 
GA 

TRRWSGG*

gi|139646|sp|P04295|VTER_HSV11 PROBABLE DNA PACKAGING PROTEIN

[SEQ ID NO:164] >contig16 (start 10626 - stop 9661) translated

VWRVVRGDERLKIFRCLTVLTEPLCQVALPDPDPERALFCEIFLYLTRPKALRLPSNTFFAIFFFNRE
RR 

YCATVHLRSVTHPRTPLLCTLAFGHLEAASPPEETPDPAAEQLADEPVAHELDGAYLVPTEPPPNPGA 
CC 

ai/3PGawwhlpggriycwamdddu;six:ppgsrarh]^wllsritdppggggacaptahidsanal™ 
ap

avaeacpcvapcmwsnmaqrtlavrgdaslcqllfghpvdavilrqatrrpritahlhevvvgrdgae 

SV 

IRPTSAGWRLCVLS SYTSRLFATSC PAVARAVARASSSDYK* 
gi|136829|sp|P10200|UL16_HSV11 PROTEIN UL16

[SEQ ID NO:165] >contig16 (start 11723 - stop 10881) translated

LTEACAAERVVRPHQLSPAAQTALLRRFPi^ 

IR 

AALQGGPRIFQRLRYDFGPHQSEWLGEVTRRFPVLLENI^R^EGTAPDAFFHTAYAIAVLAH
GR 

GRRRRLVPLSDDIPARFADSDAHYAFDYYSTSGDTLRLTNRPIAWIDGDVNGREQSKCRFMEGSPST 
AP 

HRVCEQYLPGESYAYLCLGFNRRLCGLWFPGGFAFTINTAAYLSLADPVARAVGLRFCRGAATGPGL 
VR

gi|136835|sp|P10201|UL17_HSV11 PROTEIN UL17

[SEQ ID NO:166] = Contig ID 17 Length: 732 Type: N Check: 3911

[SEQ ID NO:167] >contig17 (start 747 - stop 1) translated

PAASPLEPLGDPTLWRALYACVLAALERQTGPVAL
LD 

VEAKVDVDPLALAARVAEH PGARLAWARLAAIRDS PQCAS SASLAVTI TTRTARF AREYTTLAF P PTS 
KE 

GAFADLVEVCEVGLRPRGH PQRVTARVLL PRGYDYFVS AGDGFSAPALVALFRQWHTTVHAAPG ALAP 

vf

AFLGAGFDVRGGPVQYFAVLGFPGWPTFTVPAAAXXXXX 



gi|136802|sp|P10192|HEPA_HSV11 DNA HELICASE/PRIMASE COMPLEX

[SEQ ID NO:168] = Contig ID 18 Length: 3006 Type: N Check: 6117

[SEQ ID NO:169] >contig18 (start 2 - stop 673) translated

XXXXXALEREQRAADRAAGGGAGRPAEADLLRADYDIIDVSKSMDDDTYVANSFQHQYIPAYGQDLER 

LS 

RLWEHELVRCFKILRHRNNQGQETSISYSSGAIASFVAPYFEYVLRAPRAGALITGSDVILGEEELWE
AV 

FKKTRLQTYLTDVAALFVADVQHAALPRPPS PTPADFRASAS PRGGSRSRTRTRSRS PGRTPRGAPDQ 
GW 

GVERRDGRPHARR* 

gi|136794|sp|P10190|UL06_HSV11 VIRION PROTEIN UL6

[SEQ ID NO:170] >contig18 (start 612 - stop 1538) translated

VRRTRAGASNAGMADPTPADEGTAAAILKQAIAGDRSLVEVAEGISNQALLRMACEVRQVSDRQPRFT 

AT

SVLRTOOTPRGRLRFVLIXSSSDDAWASEDYFKRCGDQP 
RL 

SLFRPTDLRDFELVCLLMYLENC PRSHATPSLFVKVSAWLGWARHAS PFERVRCLLLRSCHWILNTL 
MC 

MAGVKPFDDELVLPHWYMAHYLIaANNPPPVLSALFCATPQSSALQLPGPVPRTDCVAYNPAGW
KS 

KDLRSALVYWWLSGS PKRRTSS LF YRFC * 

gi|136798|sp|P10191|UL07_HSV11 PROTEIN UL7

[SEQ ID NO:171] >contig18 (start 3021 - stop 1795) translated

ACLGAWPAVGARWLP PRAWPAVAS EAAGRLLPAFREAVARWHPTATTI QLLDPPAAVGPVWTARFCF 
SG 

LQAQLIAALAGIX5EAGLPEARGRAGLERLDALVAAAPSEPWARAVLERLVPDACDACPALRQLLGGVM 
AA

VCLQI EQTAS SVKFAVCGGTGAAFWGLFNVDPGDADAAHGAI QDARRALEASVRAVLS ANG I RPRLAP 
SL 

ALEGVYTHVOTWSQTGAWFWNSRDDTDFLQGFPLRGPAYAAAAEVMRDALRRILRRPAAGPPEEAVCA 
AR 

GIMEDACDRFVIJDAFGRRLDAEYWSVLTPPGEADDPLPQTAFRGGALLDAEQYT^VVRVC PGGGESV
GV 

PVDLYPRPLVLPPVDCAHHLREILREIQLWTGVLEGVWGEGGSFVYPFEEKMRFLFP* 

gi|136802|sp|P10192|HEPA_HSV11 DNA HELICASE/PRIMASE COMPLEX

[SEQ ID NO:172] = Contig ID 2 Length: 429 Type: N Check: 5672

[SEQ ID NO:173] = Contig ID 3 Length: 15901 Type: N Check: 1337

[SEQ ID NO:174] >contig3 (start 1547 - stop 2791) translated

MADIPPDPPALNTTPANHAPPSPPPGSRKRRRPVLPSSSESEGKPDTESESSSTESSEDEAGDLRGGR
RR 

SPRELGGRYFLDLSAESTTGTESEGTGPSDDDDDDASDGWLVDTPPRKSKRPRINLRLTSSPDRRAGV 
VF 

PEJVWRNDRPIRAAQPQAPAQSSGDRAAAPRRSARQAQMRSGAAOTLDI^YIRQCVNQLFRILRAA 
PG

SANRLRHLVRDCYLMGYCRTRLGPRTWGRLLQISGGTWDVRLRNAIREVEARFEPAAEPVCELPCLNA
RR 

YGPECDVGNLETNGGSTSDDEISDATDSDDTLASHSDTEGGPSPAGRENPESASGGAIAARLECEFGT 
FD 

OTSEEGSQPWLSAVVADTSSAERSGLPAPGACRATEAPEREDGCRKMRFPAACPYPCGHTFLRP*

gi|124184|sp|P04485|IE68_HSV11 IMMEDIATE-EARLY PROTEIN IE68

[SEQ ID NO:175] >contig3 (start 3848 - stop 2973) translated

MGVVWSVVTLLDQRNALPRTSADASPALWSFLLRQCRII^
FT 

RPI VRTRSCRCPPNTTTGLFAEDDPLES I EILDAPACFRLLHQERPGPHRLYHLWWGAADLCVPFLE 
YA 

QKTRLGFRFIAMKTNDAWVGEPWPLPDRFLPERTVSWTPFPAAPNHPLENLLSRYEYQYGVWPGDRE 
RS

CLRWLRSLVAPHNKPRPASSRPHPATHPTQRPCFTCMGRPEIPDEPSWQTGDDDPQNPGPPIiAVGDEW 
PP 

SSHVCYPITNL* 

gi|137125|sp|P13292|US02_HSV2 PROTEIN US2

[SEQ ID NO:176] >contig3 (start 4044 - stop 5579) translated

VGGCVDKLPLLKTPGPVARGARWLARATRRMACRKFCGVYRRPDKRQEASVPPETNTAPAFPASTFYT 
PA 

EDAYLAPGPPETIHPSRPPSPGEAARLCQLQEILAQMHSDEDYPIVDAAGAEEEDEADDDAPDDVAYP
ED 

YAEGRFLSMVSAAPLPGASGHPPVPGRAAPPDVRTCDSGKVGATGFTPEELDTMDREALRAISRGCKP 
PS 

TLAKLVTGI/SFAIHGALIPGSEGCVFDSSHPNYPHRVIVKAGWYASTNHEARLLRRLNHPAI^ 
HV

VSGVTCLVLPKYHCDLYTYLSKRPSPLGHLQITAVSRQLLSAIDYVHCEGIIHRDIKTENILINTPEN 
IC 

LGDFGAACFVRGCRSS PFH YGI AGTI DTNAPEVLAGDPYTQVI D I WSAGLVI FETAVHTASLFS APRD 
PE 

RRPCDNQIARIIRQAQVHVDEFPTHAESRLTAHYRSRAAGNNRPAWTRPAOTRYTKIHTDVEYL
LT 

FDAALRPSAAELLRLPLFHPK* 

gi|125617|sp|P13287|KR1_HSV2 SERINE/THREONINE-PROTEIN KINASE

[SEQ ID NO: 177] >contig3 (start 8255 - stop 8368) translated 

VGGLCLMILGMACLLEVLRRLGRELARCC PHAGQFAP* 

gi|137132|sp|P13293|VGLJ_HSV2 GLYCOPROTEIN J

[SEQ ID NO:178] >contig3 (start 8791 - stop 9993) translated

VCIAYHGMGRLTSGVGTAALLWAVGLRWCAKYAI^PSLKMADPNRFRGKNLPVLDQLTDPPGVKR 

VY 

HIQPSLEDPFQPPSIPITVYYAVLERACRSVLLHAPSEAPQIWGASDEIARKHTYI^TIAWYRMGDNC 
AI

PITVMEYTEC PYNKS LGVC PI RTQ PRWSYYDSFS AVS EDNLGFLMHAPAFETAGTYLRLVKINDWTE I 
TQ 

FILEHRARASCKYALPLRIPPAACLTSKAYQQGVTVDSIGMLPRFIPENQRTVALYSLKIAGWHGPKP 
PY 

TSTLLPPELSDTTNATQPELVPEDPEDSALLEDPAGTVSSQIPPNWHIPSIQDVAPHHAPAAPSNPGL
II 

GALAGSTLAVLVIGGIAFWVRRRAQMAPKRLRLPHIRDDDAPPSHQPLFY* 
gi|138234|sp|P03172|VGLD_HSV2 GLYCOPROTEIN D PRECURSOR

[SEQ ID NO:179] >contig3 (start 10012 - stop 11313) translated

VYLWARVGGWLGYLGGTWTPHKGSLEGGKLGQFIGRERGARTAVPTISHRAHSHLDPSDPGMPGRSLQ 
GL 

AI LGLWVCATGLWRGPTVSLVSDSLVDAGAVGPQGFVEEDLRVFGELHFVGAQVPHTNYYDGI I ELF 
HY

PLGNHC PRWHWTLTACPRRPAVAFTLC RSTHHAHS PAYPTLELGLARQPLLRVRTATRDYAGLYVL 
RV 

WGSATNASLFVLGVALSANGTFVYNGSDYGSCDPAQLPFSAPRLGPSSVYTPGASRPTPPRTTTSPS 
SP 

RDPTPAPGDTGTPAPASGERAPPNSTRSASESRHRLTVAQVIQIAI PASI IAFVFLGSCICFIHRCQR
RY 

RRPRGQIYNPGGVSCAVNEAAMARLGAELRSHPNTPPKPRRRSSSSTTMPSLTSIAEESEPGPWLLS 
VS 

PRPRSGPTAPQEV* 

gi|138328|sp|P06764|VGLI_HSV23 GLYCOPROTEIN I

[SEQ ID NO:180] >contig3 (start 11632 - stop 12984) translated

MARGAGLVFFVGVWWSCIAAAPRTSWKRVTSGEDVVLLPAPAGPEERTRAHKLLWAAEPLDACGPLR 

ps

WVALWPPRRVLEWVDAACMRAPEPLAIAYSPPFPAGDEGLYSEIJ^WRDRVAVVNESLVIYGALETDS 
GL 

YTLSWGLSDEARQVASWLWEPAPVPTPTPDDYDEEDDAGVSERTPVSVPPPTPPRRPPVAPPTHP 
RV 

I PEVS HVRGVTVHMETPEAI LFAPGETFGTNVS I HAI AHDDGPYAMDWWMRFDVPS SC AEMRI YEAC
LY 

HPQLPECLSPADAPCAVSSWAYTUJVVRSYAGCSRTTPPPRCFAEARMEPVPGLAWLASTVNLEFQHA^ 
PQ 

HAGLYLCVVYVDDHIHAWGHMTISTAAQYRNAVVEQHLPQRQPEPVEPTRPHVRAPPPAPSAR 
GA

VLGAALLLAALGLS AWGVHDLLAQALLAGG *

gi|138240|sp|P04488|VGLE_HSV11 GLYCOPROTEIN E PRECURSOR

[SEQ ID NO:181] >contig3 (start 13431 - stop 13568) translated

VALHAVDAP SQFVTWLAVRWLRG AVGLGAVLCG I AFYVTS I ARGA * 

gi|1944544|gnl|PID|e312381 US8A

[SEQ ID NO:182] >contig3 (start 13668 - stop 13937) translated

MTSRPADQDSVRS S ASVPLYPAAS PVPAEAYYSES EDEAANDFLVRMGRQQS VLRRRRRRTRCVGLVI 
AC 

LWALLSGGFGALLVWLLR* 

gi|135568|sp|P06481|TEGP_HSV11 TEGUMENT PHOSPHOPROTEIN US9

[SEQ ID NO:183] >contig3 (start 15333 - stop 14425) translated

MIRRRGNVEIRVYYESVRPSRSRSHLKPSDHQEFPGHHVSPGSPGFPESPGNREFHDLPENPGSRAYP 
GT 

RDPHDPHGCPGSLDPHGNPAQPAGLPSPVPYAPLGSPDPSSPRQRTYVLPRVGIRNAPASDTRAPKRA
HS 

RHRADRPPES PGSELYPLNAQALAHLQMLPADHRAFFRTVI EVSRLC ALNTHDPPPPLAGARVGQEAQ 
LV 

HTQWLRANRESSPLWPWRTA?VMNFIAAAAPCVQTHRHMHDLLMACAFWCCLAHASTCSYA 

HL

FRAFGCGPPVLTTSRGQGGWCN* 

gi|137138|sp|P06486|US10_HSV11 VIRION PROTEIN US10

[SEQ ID NO: 184] = Contig ID 4 Length: 179 Type: N Check: 5124 
[SEQ ID NO: 185] = Contig ID 5 Length: 2117 Type: N Check: 9467 



[SEQ ID NO: 186] >contig5 (start 1020 - stop 1) translated 

MLNDMQW^SSDSEEETEVGISDDDLHRDSTSEAGSTDTEMFEAGLMDAATPPARPPAERQGSPTPAD 
AQ

GSCGGGPVGEEEAEAGGGGDVCAVCTDEIAPPLRCQSFPCLHPFCIPCMKTWIPLRNTCPLCNTPVAY 
LI 

VGVTASGSFSTIPIVNDPRTRVEAEAATOSGTAVDFIWTGNPRTAPRSLSLGGHTVRALSPTPPWPGT 
DD 



EDDDLADGEGGRGSGTGRGSGTGRGSGTGRGSGTGRGSGGGRAGVGHWAGVGRGXGTNRGFPSLSPSA 
AD 

YVPPAPRRAPRRGGGGAGATRGTSQ PAATRPAP PGAPRS SS SGGAPLRAGVGSGSXXXXX 
gi|124135|sp|P28284|ICP0_HSV2H TRANS-ACTING TRANSCRIPTIONAL

[SEQ ID NO:187] = Contig 6 Length: 643 Type: N Check: 5042 



[SEQ ID NO: 188] = Contig 7 Length: 354 Type: N Check: 9326 



[SEQ ID NO:189] = Contig 8 Length: 6387 Type: N Check: 4794

[SEQ ID NO:190] >contig8 (start 3 - stop 1454) translated

XXXXXTRRICARGPALPPGGLAVGGQMYVNRNEI FNAALAVTNI I LDLD I ALKEPVPF PRLHEALGH F 
RR 

GALAAVQLLFPAARVDPDAYPCYFFKSACRPRAPPVCAGDGPSAGGDDGDGDWFPDAGGDDGDEEWEE 
DT

DPMDTTHGPLPDDEAAYLDLLHEQIPAATPSEPDSWCSCADKIGLRVCLPVPAPYWHGSLTMRGVA 
RV 

IQQAVLLDRDFVEAVGSHVKNFLLIDTGVYAHGHSLRLPYFAKIGPDGSACGRLLPVFVIPPACEDVP 
AF 

VAAHADPRRFHFHAPPMFSAAPREIRVLHSLGGDYVSFFEKKASRNALEHFGRRETLTEVLGRYDVRP
DA 

GETVEGFASELLGRI VACI EAHFPEHAREYQAVS VRRAVIKDDWVLLQLI PGRGALNQSLSCLRFKHG 
RA 

SRATARTFLALSVGTNNRLCASLCQQCFATKCDNNRLHTLFTVDAGTPCSRSAPSSTSRPSSS* 

gi|136939|sp|P10236|UL52_HSV11 DNA HELICASE/PRIMASE COMPLEX

[SEQ ID NO:191] >contig8 (start 1406 - stop 2422) translated

MLAVRSLQHLTTVI F ITAYGLVLAWYI VFGAS PLHRC I YAVRPAGAHNDTALVWMKINQTLLFLG PPT 
AP

PGGAWTPHAHVCYANI IEGRAVSLPAI PGAMSRRVMNVHEAVNCLEALWDTQMRLVWGWFLYLAFVA 
LH 

QRRCTFGWSPAHSMVAPATYLLNYAGRIVSSWLQYPYTKITRLUTELSVQRQTLVQLFE^PVTFL 
YH 

RPAVGVIVGCELLLRFVALGLIVGTALISRGACAITYPLFLTITTWCFVSIIALTELYFILRRDSAPK
NA 

EPAAPRGRSKGWSGVCGRCCSIILSGIAVRLCYIAWAGWLMALRYEQEIQRRLFDL* 

gi|116105|sp|P22485|CELF_HSV2H CELL FUSION PROTEIN PRECURSOR

[SEQ ID NO:192] >contig8 (start 2752 - stop 4506) translated

VTPDGEGQGGVSESRPRSCGYKGSHRPTGRCVLPCADPGCASVPLLDSDPATLFRHAPPRRTPAIPAP 
AT 

YNMATDIDMLIDLGLDLSDSELEEDALERDEEGRRDDPESDSSGECSSSDEDMEDPCGDGGAEAIDAA 
IP

KGPPARPEDAGTPEASTPRPAARRGADDPPPATTGVWSRLGTRRSASPREPHGGKVARIQPPSTKAPH 
PR 

GGRRGRRRGRGRYGPGGADSTPNPRRRVSRNAHNQGGRHPASARTDGPGATHGEARRGGEQLDVSGGP 
RP 

RGTRQAPPPLMALSLTPPHADGRAPVPERKAPSADTIDPAVRAVLRSISERAAVERISESFGRSALVM
QD 

PFGGMPFPAANSPWAPVIATQAGGFDAETRRVSWETLVAHGPSLYRTFAANPRAASTAJCAMRIX 
EN 

LIEAIASADETI^WCKMCIHHNLPLRPQDPIIGTAAAVLENIATRLRPFLQCYLKARGLCGLDDLCSR 
RR

LSDIKDIASFVLVILARIJ^RVERGVSEIDYTWGVGAGETMHFYIPGACMAGLIEILDTHRQECSSR 
VC 

ELTASHTIAPLYVHGKYFYCNSLF* 

gi|124181|sp|P28276|IE63_HSV2H TRANSCRIPTIONAL REGULATOR IE

[SEQ ID NO: 193] >contig8 (start 4638 - stop 5282) translated 

MWGPGPARFIARPGTHGRRVFTDPPPRNMTTTPLSNLFLRAPDITHVAPPYCLNATWQAENALHTTKT
DP 

ACLAARSYLVRASCSTSGPIHCFFFAVYKDSQHSLPLVTELRNFADLVNHPPVLRELEDKRGGRLRCT
GP 

FSCGTI KDVSGAS PAGEYT ING I VYHCHCRY PFSKTCWLG ASAALQHLRS I SS SGTAARAAEQRRHKI 
KI 

KIKV* 

gi|136947|sp|P28281|UL55_HSV2H PROTEIN UL55

[SEQ ID NO: 194] >contig8 (start 5808 - stop 5455) translated 

MIGAHPGVGGDLPSGLPTYAEATSDRPPTYAMVMAACPTEPPGGSVGPADQPRVQSSRTWRPPLVNSR 
EL

YRAQRAARCASS SDTPQAPGWCGGTCRHAVFGVVAVVVVI ILAFLWR * 
gi|136952|sp|P28282|UL56_HSV2H PROTEIN UL56 

[SEQ ID NO: 195] = Contig 9 Length: 3700 Type: N Check: 8257 



[SEQ ID NO:196] >contig9 (start 2 - stop 355) translated

XXXXXGGHAAAGLTELCQTIAPRDLTDPLLFAW SLTQVL 
GL 

RRRLHKDPDAG PWAAATLRGLFFSVYALGFAAGVLVRPRMAASRRSG * 

gi|136909|sp|P10227|UL43_HSV11 MEMBRANE PROTEIN UL43

[SEQ ID NO: 197] >contig9 (start 453 - stop 2099) translated 

MGAGVPWTGI KARGAGGP ITVRVLGWEVAQKATHPCC SC PREAWSGNPPRCAGRAHRS FAGAGALLV 
MA

LGRVGLAVGLWGLLWVGVVVVLANAS PGRTITVG PRGNASNAAPS AS PRNAS A PRTTPTP PQ PRKATK
SK 

ASTAKPAPPPKTGPPKTSSEPVRCNRHDPLARYGSRVQIRCRFPNSTRTESRLQIWRYATATDAEIGT 
AP 

SLEEVMVNVSAPPGGQLVYDSAPNRTDPHVIWAEGAGPGAS PRLYS WGPLGRQRLI IEELTLETQGM
YY 

WVWGRTDRPSAYGTWVRVRWRPPSLTIHPHAVLEGQPFKATCTAATYYPGNRAEFVWFEDGRRVFDP 
AQ 

IHTQTQENPDGFSTVSTVTSAAVGGQGPPRTFTCQLTWHRDSVSFSRRNASGTASVLPRPTITMEFTG 
DH

AVCTAGCVPEGVTFAWFLGDDSSPAEKVAVASQTSCGRPGTATIRSTLPVSYEQTEYICRLAGYPDGI 
PV 

LEHHGSHQPP PRDPTERQVI RAVEGAG IGVAVLVAWLAGTAWYLTHAS S VR YRRLR * 

gi|138220|sp|P06475|VGLC_HSV23 GLYCOPROTEIN C PRECURSOR

[SEQ ID NO: 198] >contig9 (start 2266 - stop 2847) translated 

VGSKRLRKRAPRPDIQARGGAMAFRASGPAYQPLAPAASPARARVPAVAWIGVGAIVGAFALVAALVL 
VP 

PRS SWGLSPCDSGWQEFNAGCVAWDPTPVEHEQAVGGCS APATLI PRAAAKHLAALTRVQAERS SG YW
WV 

NGDGIRTCLRLVDSVSGIDEFCEELAIRICYYPRSPGGFVRFVTSIRNALGLP* 
gi|136917|sp|P06483|UL45_HSV23 PROTEIN UL45 HOMOLOG

[SEQ ID NO:199] >contig9 (start 3716 - stop 3114) translated 

QRPAAAARPLAAQREAAGVYDAVRTWGPDAEAEPDQMENTYLLPDDDAAMPAGVGLGATPAADTTAAA 
WP 

AESHAPRAPSEDADSIYESVSEDGGRVYEEIPWVRVYENICLRRQDAGGAAPPGDAPDSPYIEAENPL 
YD

WGGSALFSPPGATRAPDPGLSLSPMPARPRTNALANDGPTNVAALSALLTKLKRGRHQSH* 
gi|114350|sp|P10230|ATl2_HSV11 ALPHA TRANS-INDUCING FACTOR
TABLE 3

[SEQ ID NO:200] = Contig ID 2

[SEQ ID NO:201] = Contig ID 3

[SEQ ID NO:202] = Contig ID 4

[SEQ ID NO:203] = Contig ID 5

[SEQ ID NO:204] = Contig ID 7

[SEQ ID NO:205] = Contig ID 12

[SEQ ID NO:206]

ORF # = 1 from Contig ID 12 
ORF start site = 120 
ORF end site = 1371 
ORF sequence:

MADIPPDPPALNTTPANHAPPSPPPGSRKRRRPVLPSSSESEGKPDTESESSSTESSEDEAGDLRGGR
RR 

SPRELGGRYFLDLSAESTTGTESEGTGPSDDDDDDASDGWLVDTPPRKSKRPRINLRLTSSPDRRAGV 
VF 

PEVWRNDRPIRAAQPQAPAQSSGDRAAAPRRSARQAQMRSGAAWTLDLHYIRQCVNQLFRILRAAPNP 
PG

SANRLRHLVRDCYI^GYCRTRLGPRTWGRLLQISGGTVTOVRLRNAI^ 
RR 

YGPECDVGNLETNGGSTSDDEISDATDSDDTLASHSDTEGGPSPAGRENPESASGGAIAARLECEFGT 
FD 

WTSEEGSQPWLSAWADTSSAERSGLPAPGACRATEAPEREDGCRKMRFPAACPYPCGHTFLRP*

Gene matched: gi|124184|sp|P04485|IE68_HSV11
Gene name: IMMEDIATE-EARLY PROTEIN IE68

[SEQ ID NO:207]
ORF # = 2 from Contig 12
ORF start site = 2428 
ORF end site = 1553 
ORF sequence: 
MGWWSWTLLDQRNALPRTSADASPALWSFLLRQC^
FT 

RPIVRTRSCRCPPNTTTGLFAEDDPLESIEILDAPACFRLLHQERPGPHRLYHLWVVGAADLCVPFLE 
YA 

QKTRLGFRFIAMKTNDAWVGEPWPLPDRFLPERTVSWTPFPAAPNHPLENLLSRYEYQYGVWPGDRE 
RS

CLRWLRSLVAPHNKPRPASSRPHPATHPTQRPCFTCMGRPEIPDEPSWQTGDDDPQNPGPPLAVGDEW 
PP 

SSHVCYPITNL* 

Gene matched: gi|137125|sp|P13292|US02_HSV2
Gene name: PROTEIN US2 gi|419137|pir||A4

[SEQ ID NO:208]
ORF # = 3 from Contig 12
ORF start site = 2714
ORF end site = 4159 
ORF sequence: 

MACRKFCGVYRRPDKRQEASVPPETNTAPAFPASTFYTPAEDAYLAPGPPETIHPSRPPSPGEAARLC 
QL 

QEIIjAQMHSDEDYPIVDAAGAEEEDEADDDAPDDVAYPEDYAEGRFLSMVSAAPLPGASGHPPVPGRA
AP 

PDVRTCDSGKVGATGFTPEELDTMDREALRAISRGCKPPSTLAKLVTGLGFAIHGALIPGSEGCVFDS 
SH 

PNYPHRVIVKAGWYASTNHEARLLRRLNHPAILPLLDLHWSGVTCLVLPKYHCDLYTYLSKRPSPLG 
HL

QITAVSRQLLSAIDYVHCEGIIHRDI KTENILINTPENICLGDFGAACFVRGCRSSPFHYGIAGTIDT 
NA 

PEVLAGDPYTQVIDIWSAGLVIFETAVHTASLFSAPRDPERRPCDNQIARIIRQAQVHVDEFPTHAES 
RL 

TAHYRSRAAGNNRPAWTRPAWTRYYK IHTDVEYLIC KALTFDAALRPSAAELLRL PLFHPK *

Gene matched: gi|125617|sp|P13287|KR1_HSV2
Gene name: SERINE/THREONINE-PROTEIN KINAS

[SEQ ID NO:209]
ORF # = 4 from Contig 12
ORF start site = 6835
ORF end site = 6948
ORF sequence:

VGGLCLMILGMACLLEVLRRLGRELARCCPHAGQFAP*

Gene matched: gi|137132|sp|P13293|VGLJ_HSV2
Gene name: GLYCOPROTEIN J gi|419140|pir|

[SEQ ID NO:210]
ORF # = 5 from Contig 12 
ORF start site = 7392 
ORF end site = 8573
ORF sequence: 

MGRLTSGVGTAALLWAVGLRWCAKYALADPS 
LE 

DPFQPPSIPITVYYAVLERACRSVLLHAPSEAPQIVRGASDEARKHTYNLTIAWYRMGDNCAIPITVM 
EY

TECPYNKSLGVCPIRTQPRWSYYDSFSAVSEDNLGFLMHAPAFETAGTYLRLVKINDWTEITQFILEH 
RA 

RASCKYALPLRIPPAACLTSKAYQQGVTVDSIGMLPRFIPENQRTVALYSLKIAGWHGPKPPYTSTLL 
PP 

ELSDTTNATQPELVPEDPEDSALLEDPAGTVSSQI PPNWHI PSIQDVAPHHAPAAPSNPGLI IGALAG
ST

LAVLVIGGIAFWVRRRAQMAPKRLRLPHIRDDDAPPSHQPLFY* 

Gene matched: gi|138234|sp|P03172|VGLD_HSV2
Gene name: GLYCOPROTEIN D PRECURSOR

[SEQ ID NO: 211] 
ORF # = 6 from Contig 12 
ORF start site = 8775 
ORF end site = 9893
ORF sequence: 

MPGRSLQGIjAI LGLWVCATGLWRG PTVSLVSDSLVDAGAVGPQGFVEEDLRVFGELHFVGAQVPHTN 
YY 

DGI IELFHYPLGNHCPRVVHVVTLTACPRRPAVAFTLCRSTHHAHS PAYPTLELGLARQPLLRVRTAT 
RD

YAGLYVLRVWGSATNASLFVK3VALSANGTFVYNGSDYGSCDPAQLPFSAPRU3PSSVYTPGASRPT 
PP 

RTTTSPSSPRDPTPAPGDTGTPAPASGERAPPNSTRSASESRHRLTVAQVIQIAIPASIIAFVFLGSC 
IC 

FIHRCQRRYRRPRGQIYNPGGVSCAVNEAAMARLGAELRSHPNTPPKPRRRSSSSTTMPSLTSIAEES
EP 

GPWLLSVSPRPRSGPTAPQEV* 

Gene matched: gi|138328|sp|P06764|VGLI_HSV23
Gene name: GLYCOPROTEIN I gi|73722|pir|

[SEQ ID NO:212]
ORF # = 7 from Contig 12
ORF start site = 10212
ORF end site = 11858 
ORF sequence: 

MARGAGLVFFVGVWWSCLAAAPRTSWKRVTSGEDWLL PAPAGPEERTRAHKLLWAAE PLDACGPLR 
PS 

WVALWP PRRVLETWDAACMRAPEPLAI AYS P PFPAGDEGL YSELAWRDRVAWNESLVI YGALETDS
GL 

YTLSWGLSDEARQVASWLWEPAPVPTPTPDDYDEEDDAGVSERTPVSVPPPTPPRGPPVAPPTHP 
RV 

IPEVSHVRGVTVHMETPEAILFAPGETFGTNVSIHAIAHD^^ 
LY

H PQL PECLS PADAPCAVSS WAYRLAVRS YAGC SRTTP PPRC FAEARMEPVPGLAWLASTVNLEFQHAS 
PQ 

HAGLYLCWYVDDHI HAWGHMT I STAAQYRNAWEQHL PQRQPEPVE PTRPHVRAPP PAPS ARGPLRL 
GA 

VLGAALLLAALGLSAWACMTCWRRRSWRAVKSRASATGPTYIRVADSELYADWSSDSEGERDGSLWQD
PP 

ERPDSPSTNGSGFEILSPTAPSVYPHSEGRKSRRPLTTFGSGSPGRRHSQASYSSVLW* 

Gene matched: gi|138240|sp|P04488|VGLE_HSV11
Gene name: GLYCOPROTEIN E PRECURSOR 

[SEQ ID NO: 213] 
ORF # = 8 from Contig 12 
ORF start site = 12010
ORF end site = 12147 
ORF sequence: 

VALHAVDAPSQFVTWLAVRWLRGAVGLGAVLCGIAFYVTS IARGA* 

Gene matched: gi|1944544|gnl|PID|e312381
Gene name: (X14112) US8A [human herpesvirus

[SEQ ID NO:214]

ORF # = 9 from Contig 12 
ORF start site = 12247 
ORF end site = 12516 
ORF sequence: 
MTSRPADQDSWSSASVPLYPAASPVPAEAYYSESEDEAANDFLVRMGRQQSVLRRRRRRTRCVGLVI
AC 

LWALLSGGFGALLVWLLR* 

Gene matched: gi|135568|sp|P06481|TEGP_HSV11
Gene name: TEGUMENT PHOSPHOPROTEIN US9

[SEQ ID NO:215]

ORF # = 10 from Contig 12 
ORF start site = 13912 
ORF end site = 13004 
ORF sequence: 

MIRRRGNVEIRVYYESVRPSRSRSHLKPSDHQEFPGHHVSPGSPGFPESPGNREFHDLPENPGSRAYP
GT 

RDPHDPHGCPGSLDPHGNPAQPAGLPSPVPYAPLGSPDPSSPRQRTYVLPRVGIRNAPASDTRAPKRA 
HS 

RHRADRPPESPGSELYPLNAQALAHLQMLPADHRAFFRTVIEVSRLCALNTHDPPPPLAGARVGQEAQ 

lv

htqwlranressplwpwrtaamnfiaaaapcvqthrhmhdlimcafwcclahastcsyaglysahcq 

HL 

FRAFGCG P PVLTTSRGQGGWCN * 

Gene matched: gi|137138|sp|P06486|US10_HSV11
Gene name: VIRION PROTEIN US10

[SEQ ID NO:216]

ORF # = 11 from Contig 12
ORF start site = 15899 
ORF end site = 16582 
ORF sequence: 

MSAEQRKKKKTTTTTQGRGAEVAMADEDGGRLRAAAETTGGPGSPDPADGPPPTPNPDRRPAARPGFG
WH 

GGPEENEDEDDDAAADADADEAAPASGEAVDEPAADGWSPRQI^LASMVDEAVRTIPSPPP 
EE 

eaars ps pprtpsmcadygeenddddddddrdagrwvrgpendvrg prgvpgphgq pvaatpgap ptp 
pp

ppppppparpppaldrl* 

Gene matched: gi|124141|sp|P08392|ICP4_HSV11
Gene name: TRANS-ACTING TRANSCRIPTIONAL

[SEQ ID NO:217] = Contig ID 15

[SEQ ID NO:218]
ORF # = 1 from Contig 15
ORF start site = 755
ORF end site = 1297
ORF sequence:

MRTPADDVSWRYEAPSVIDYARIDGIFLRYHCPGLDTFLWDRHAQRAYLVNPFLFAAGFLEDLSHSVF 

PA 

DTQETTTRRALYKEIRDALGSRKQAVSHAPVRAGCVNFDYSRTRRCVGRRDLRPANTTSTWEPPVSSD 
DE 

ASSQSKPLATQPPVLALSNAPPRRVSPTRGRRRHTRLRRN*

Gene matched: gi|136776|sp|P28278|VGLL_HSV2H
Gene name: GLYCOPROTEIN L PRECURSOR gi|

[SEQ ID NO:219]
ORF # = 2 from Contig 15
ORF start site = 1170
ORF end site = 2174
ORF sequence: 

MKRARSRSPSPPSRPSSPFRTPPHGGSPRREVGAGILASDATSHVCIASHPGSGAGYPTRLAAGSAVQ 
RR 

RPRGCPPGVMFSASTTPEQPLGLSGDATPPLPTSVPLDWAAFRRAFLIDDAWRPLLEPELANPLTARL 
LA

EYDRRCQTEEVLPPREDVFSWTRYCTPDDVRWI IGQDPYHHPGQAHGLAFSVRADVPVPPSLRNVLA 
AV 

KNCYPDARMSGRGCLEKWARDGVLLLNTTLTVKRGAAASHSK 
GA 

HAQNAIRPDPRQHYVLKFSHPSPLSKVPFGTCQHFLAANRYLETRDIMPIDWSV*

Gene matched: gi|137037|sp|P10186|UNG_HSV11
Gene name: URACIL-DNA GLYCOSYLASE

[SEQ ID NO:220]
ORF # = 3 from Contig 15
ORF start site = 2229
ORF end site = 2930
ORF sequence:

MVKSRVSYRSVMSGVGEERVPSAFTIIASWGOTFAPQN^ 
PD 

RDAISPLTSSVAGDPPGAIX3PYVTFOTLFMVSSID 
GT

AARQRKRGAPPQRTCVPRSNKSLQMF\^CKRANAAQVREQLRAVIRSRKPRKYYTRSSDGRLCPAVPV 
FV 

HEFVSSEPMRLHRDNVMLSTEPD* 

Gene matched: gi|136782|sp|P28279|UL03_HSV2H
Gene name: PROTEIN UL3

[SEQ ID NO:221]
ORF # = 4 from Contig 15
ORF start site = 3735 
ORF end site = 3130 
ORF sequence: 

MGNPQTTIAYSLHHPRASLTSALPDAAQWHVFESGTRAVLTRGRARQDRLPRGGWIQHTPIGLLVI
ID 

CRAEFCAYRFIGRASTQRLERWWDAHMYAYPFDSWSSSHGESVRSATAGILTVVWTPDTIYITATIY 
GT 

APEAARGCDNAPLDVRPTTPPAPVSPTAGEFPANTTDLLVEVLREIQISPTLDDADPTPGT* 

Gene matched: gi|136788|sp|P28280|UL04_HSV2H
Gene name: PROTEIN UL4 gi|73890|pir||WM

[SEQ ID NO:222]
ORF # = 5 from Contig 15
ORF start site = 6447
ORF end site = 3802
ORF sequence:

MAASGGEGSRDVRAPGPPPQQPGARPAVRFRDEAFLNFTSMHGVQPIIARIRELSQQQLDVTQVPRLQ 
WF 

RDVAALE^PTGLPLREFPFAAYLITGNAGSGKSTCVQTLNE^DCVVTGATRIAAQNMYVKLSGAFLS 
RP 

INTIFHEFGFRGNHVQAQLGQHPYTIASSPASLEDLQRRDLTYYWEVILDITKRALAAHGGEDARNEF
HA 

LTALEQTLGIX3QGALTRLASVTHGALPAFTRSNIIVIDEAGLLGRHLLTTVVYCWWM 
GR 

LRPVLVCVGSPTQTASLESTFEHQKLRCSVRQSENVLTYLICNRTLREYTRLSHSWAIFINNKRCVEH 
EF

GNLMKVLEYGLPITEEHMQFVDRFWPESYITNPANLPG^
FV 

VPTLPVLTFVSVKEFDEYRRLTQQPTLTMEKWITANASRITNYSQSQDQDAGHVRCEVHSKQQLWAR 
ND 

ITYVLNSQVAOTARLRKMVFGFDGTFRTFEAVLRDDSFVKT^
LQ 

RPGLDATQRTLAYGRLGELTAELLSLRRDAAGASATRAADTSDRS PGERAFNFKHLG PRDGGPDDFPD 
DD 

LDVIFAGLDEQQLDVFYCHYALEEPETTAAVHAQFGLLKRAFLGRYLILRELFGEVFESAPFSTYVDN 
VI

FRGCELLTGS PRGGLMSVALQTDNYTLMGYTYTRVFAFAEELRRRHATAGVAEFLEES PLPYI VLRDQ 
HG 

FMSVVNTNISEFVESIDSTEIiAMAINADYGISSKLAMTITRSQGLSLDKVAICFTPGNLRLNSAYVAM 
SR 

TTS S EFLHMNLNPLRERHERDDVI SEH I L S ALRDPNWI VY *

Gene matched: gi|122809|sp|P10189|HELI_HSV11
Gene name: PROBABLE HELICASE

[SEQ ID NO:223]
ORF # = 7 from Contig 15
ORF start site = 8457
ORF end site = 9347
ORF sequence: 

MADPTPADEGTAAAILKQAIAGDRSLVEVAEGISNQALLRMACEVRQVSDRQPRFTATSVLRVDVTPR 
GR 

LRFVLDGSSDDAYVASEDYFKRCGDQPTYRGFAVWLTANEDHVHSLAVPPLVLLHRLSLFRPTDLRD 
FE

LVCLLMYLENCPRSHATPSLFVKVSAWLGWARHAS PFERVRCLLLRSCHWI LNTLMCMAGVKPFDDE 
LV 

LPHWYMAHYLLANNPPPVLSALFCATPQSSALQLPGPVPRTDCV^ 
WL

SGSPKRRTSSLFYRFC*

Gene matched: gi|136798|sp|P10191|UL07_HSV11
Gene name: PROTEIN UL7

[SEQ ID NO:224]

ORF # = 8 from Contig 15 
ORF start site = 11855 
ORF end site = 9604 
ORF sequence: 

meapgivweesvsaitlyavwlpprtrix:lhallylvcrdaageararfaevsvgssdlqdfygspd
vs 

AAGAVAAARAAPAASPLEPLGDPTLWRALYACVLAALERQTG 
WG 

PPAAPRAALLDVEAKVDVDPLALAARVAEH PGARLAWARLAAIRDS PQCAS SASLAVTITTRTARFAR
EY 

TTLAFPPTSKEGAFADLVEVCEVGLRPRGHPQRVTARVLLPRGYDYFVSAGDGFSAPALVALFRQWHT 
TV 

HAAPGALAPVFAFLGPGFEVRGGPVQYFAVLGFPGWPTFTVPAAAAAESARDLVRGAAATHAACLGAW 
PA

VGARVVLP PRAWPAVASEAAGRLLPAFREAVARWH PTATTIQLLDPPAAVGPVWTARFC FSGLQAQLL 
AA 

LAGLGEAGLPEARGRAGLERLDALVAAAPS EPWARAVLERLVPDACDAC PALRQLLGGVMAAVC LQI E 
QT 

AS SVKFAVCGGTGAAFWGLFNVDPGDADAAHGAI QDARRALEAS VRAVL SANG I RPRLAPSLALEGVY
TH 

VWWSQTGAWFWNSRDDTDFL(^FPLRGPAYAAAAEVMRDALRRILRRPAAGPPEEAVCAARGIMED^ 
CD 

RFVLDAFGRRLDAEYWSVLTPPGEADDPLPQTAFRGGALLDAEQYWRRVVRVCPGGGESVGVPVDLYP 

rp

lvlppvdcahhlreilreiqlvftgvlegvwgeggsfvypfeekmrflfp* 

Gene matched: gi|136802|sp|P10192|HEPA_HSV11
Gene name: DNA HELICASE/PRIMASE COMPLEX

[SEQ ID NO:225]
ORF # = 10 from Contig 15
ORF start site = 14399
ORF end site = 15802
ORF sequence: 

MGRRAPRGSPEAAPGADVAPGARAAWWVWCVQVATFIVSAICWGLLVLASVFRDRFPCLYAPATSYA 
EA 

NATVEVRGGVAVPLRLDTQSLLATYA I TSTLLLAAAVYAAVGAVTSRYERALDAARRLAAARMAMPHA 
TL

IAGNVCAWLLQITVLLLAHRISQLAHLIYVLHFACLWLAAHFCTRGVLSGTYLR 
RI 

VGPVRAVMTNALLLGTLLCTAAAAVSLNTIAALNFNFSAP^^ 
VR 

40 VLVGPHLGAIAATGIVGLACEHYHTGGYYVVEQQWPGAQTGTO 
RR 

HHTKFFVRMROTRHRAHSALRRVRSSMRGSRRGGPPGDPGYAETPYASVSHHAEIDRYGDSDGDPIYD 
EV 

APDHEAELYARVQRPGPVPDAEPIYDTVEGYAPRSAGEPVYSTVRRW* 

Gene matched: gi | 136810 | sp | P04288 | VGLM_HSV11
Gene name: GLYCOPROTEIN M 

[SEQ ID NO:226]

ORF # = 11 from Contig 15 
ORF start site = 16286 
ORF end site = 15996 
ORF sequence: 

10 MGLAFSGARPCCCRHNVT ITDGGEWSLTAHEFDWDI ESEEEGNFYVPPDMRWTRAPGPQYRRASD 
PP 

SRHTRRRDPDVARPPATLTPPLSDSE* 

Gene matched: gi | 136816 | sp | P13294 | UL11_HSV2
Gene name: HYPOTHETICAL UL11 PROTEIN 



[SEQ ID NO: 227] 
ORF # = 12 from Contig 15
ORF start site = 18064 
ORF end site = 16202 
ORF sequence: 

MAAAATPGAKRPADPARDPDSPPKRPRPNSLDIATVFGPRPAPPRPTSPGAPGSHWPQSPPRGQPDGG 

ap

GEKARPASPALSEASSGPPTPDIPLSPGGAHAIDPDCSPGPPDPDPMWSASAIPNALPPHILAETFER 
HL 

RGLLRGVRSPI^IGPLWARLDYLCSLWSLEAAGMVDRGIXSRHLWRLTRRAPPSAAEAVAPRPLMGFY 
EA 

30 ATQNQADCQLWALLRRGLTTASTLRWGAQG PCFSSQWLTHNASLRLDAQS SAVMFGRVNEPTARNLLF 
RY 

CVGRADAGVNDDADAGRFVFHQPGDIAEENVHACG VLMDGHTGMVGASLD I LVC PRDPHGYLAP APQT 
PL 

AFYEVKCRAKYAFDPADPGAPAASAYEDLMARRS PEAFRAF IRS I PNPGVRYFAPGRVPGPEEALVTQ 
DR

DWLDSRAAGEKRRCS APDRALVELN SGWSEVLLFGVPDLERRTI S PVAWS SGELVRREP I FANPRHP 
NF 

KQILVQGYVLDSHFPIX:PLQPHLVTFLGRHRAGAEEGVTFRLEDGRGAPAGRGGAPGPAKASILPDQA 
VP 

40 IALIITPVRVEPGIYRDIRRNSRIAFDDTLAKLWASRSPGRGPAAADTTSSSPTAGRSSR* 



Gene matched: gi | 119694 | sp | P06489 | EXON_HSV2
Gene name: ALKALINE EXONUCLEASE»gi | 33025 

[SEQ ID NO:228]
ORF # = 13 from Contig 15 
ORF start site = 19661 
ORF end site = 18107
ORF sequence: 

MDESGRQRPASHVAADISPQGAHRRSFKAWLASYIHSLSRRASGRPSGPSPRDGAVSGARPGSRRRSS 
FR 

ERIJlAGLSRVmVSRSSRRRSSPEAPGPAAKLRRPPLRRSETAMTSPPSPPSHILSIARIHKI^IPVFA 
VN

PALRYTTLE I PGARS FGGSGGYGEVQLI REHKLAVKT I REKEWFAVELVATLLVGECALRGGRTHDIR 
GF 

ITPLGFSLQQRQIVFPAYDMDLGKYIGQLASLRATTPSVATALHHCFTDLARAWFLNTRCGISHLDI 
KC 

15 AN^VMLRSDAVSLRRAVLADFSLVTLNSNSTISRGQFCLQEPDLESPRGFGMPAALTTANFHTLVGH 
GY 

NQPPELLVKYL^ERAEFNNRPLKHDVGLAVD 
QL 

S PDFAVALLAYRCVLHPALFVNSAETNTHGLAYDVPEGIRRHLRNPKIRRAFTEQC INYQRTHKAVLS 
SV

SLPPELRPLLVLVSRLCHANPAARHSLS * 

Gene matched: gi | 125628 | sp | P04290 | KR2_HSV11 

Gene name: PROBABLE SERINE/THREONINE-PROTEIN KINASE


[SEQ ID NO: 229]
ORF # = 14 from Contig 15 
ORF start site = 20074 
ORF end site = 19415
ORF sequence: 

MSRDASHAALRRRLAETHLRAEVYRDQTLQLHREGVSTQDPRFVGAFM 
MM 

RQRATCVKIRVEEQAARRDFLTAHRRYLDPALSERLDAADDRIADQEEQLEEAAANASLWGDGDLADG 
WM

SPGDSDLLVMWQLTSAPKVHTDAPSRPGSRPTYTPSAAGRPDAQAAPPPETAPSPEPAPGPAADPASG 
SG 

FARDCPDGE* 


Gene matched: gi | 136823 | sp | P04291 | UL14_HSV11
Gene name: HYPOTHETICAL UL14 PROTEIN 

[SEQ ID NO:230]






WO 98/20016 



PCT/US97/20016 



ORF # = 15 from Contig 15 
ORF start site = 20155 
ORF end site = 21453 
ORF sequence: 
MFGQQ

LASDVQQYLERLEKQRQQKVGVDEASAGLTLGGDALRVPFLDFATATPKRHQTWPGVGTLHDCCEHS 
PL 

FSAVARRLLFNSLVPAQLRGRDFGGDHTAKLEFLAPELVRAVARLRFRECAPEDAVPQRNAYYSVLNT 
FQ 

10 ALHRSEAFRQLVHFVRDFAQLLKTSFTIASSLAETTGPPKKRAKVDVATHGQTYGTLELFQKMILMHAT 
YF 

LAAVLLGDHAEQWTFLRLVFEIPLFSDTAVRHFRQRATVFLVPRRHG 
IG 

YTAHIRKATEPVFDEIDACLRGWFGSSRVDHVKGETISFSFPDGSRSTIVFASSHNTNVSTPSSRGAC 
FP

GAALPEIDRQTNTARRECGTTRPQPPPPWRGEALLFICNRTMRLWPRPARPRGSSLQTGGWYTMTERR 
GA 

TRRWSGG* 

Gene matched: gi | 139646 | sp | P04295 | VTER_HSV11
Gene name: PROBABLE DNA PACKAGING PROTEIN 



[SEQ ID NO:231] 
ORF # = 16 from Contig 15
ORF start site = 22291 
ORF end site = 21326 
ORF sequence: 

VWRVVRGDERLKIFRCLTVLTEPLCQVALPDPDPERALFCEIFLYLTRPKALRLPSNTFFAIFFFNRE 
RR

YCATVHLRS VTH PRT PLLCTLAFGHLEAAS PPEET PDPAAEQLADEPVAHELDGAYLVPTEPPPNPGA 
CC 

ALGPGAWWHLPGGRIYCWAMDDDIXSSLCPPGSRARHLGWLLSRITDPPGGGGACAPTAHIDSAN^ 
AP 

35 AVAEAC PCVAPCMWSNMAQRTLAVRGDASLCQLLFGH PVDAVI LRQATRRPRITAHLHEVWGRDGAE 
SV 

IRPTSAGWRLCVLSSYTSRLFATSC PAVARAVARASS SDYK * 

Gene matched: gi | 136829 | sp | P10200 | UL16_HSV11
Gene name: PROTEIN UL16 



[SEQ ID NO: 232] 
ORF # = 17 from Contig 15
ORF start site = 24654
ORF end site = 22546 
ORF sequence: 

MNAHFANEVQYDLTRDPS S PAS LI HVI I S SECLAAAGVPLS ALVRGRPDGGAAANFRVETQTRAHATG 
DC

TPWRS AFAAYVPADAVGA I LAPVI PAH PDLLPRVPSAGGLFVSL PVACDAQGVYDPYTVAALRLAWGP 
WA 

TCARVLLFSYDELVPPNTRYAADGARLMRLCRHFCRYVARLGAAAPAAATEAAAHLSLGMGESGTPTP 
QA 

10 S SVSGGAGPAWGTPDPP I S PEEQLTAPGGDTATAEDVS I TQENEEI LALVQRAVQDVTRRHPVRARP 
KH 

AASGVASGLRQGALVHQAVSGGALGASDAEAVLAGLEPPGGGRFATPGGPRAAGEDVLNDVLTLVPGT 
AK 

PRSLVEWLDRGWEALAGGDRPDWLWSRRS I SWLRHHYGTKQRFVWSYENSVAWGGRRARPPRLSSE 
LA

TALTEACAAERWRPHQLSPAAQTALLRRFPALEGPLRHPRPVLQPFDIAAEVAFVARIQIACLRALG 
HS 

IRAALQGGPRIFQRLRYDFGPHQSEWIX3EVTRRFPVLLENI>MRALEGTAPDAFFHTAYALAVIAHLGG 
QG 

20 GRGRRRRLVPLSDDI PARFADSDAH YAFDYYSTSGDTLRLTNRPI AWI DGDVNGREQSKCRFMEGS P 
ST 

APHRVCEQYLPGESYAYLCLGFNRRLCGLWFPGGFAFTINTAAYLSLADPVARAVGLRFCRGAATGP . 

GL 

VR* 

Gene matched: gi | 136835 | sp| P10201 |UL17_HSV11 
Gene name: PROTEIN UL17 


[SEQ ID NO:233]
ORF # = 18 from Contig 15 
ORF start site = 24684 
ORF end site = 25955 
ORF sequence:

VPEGAWVGGACARPRGPRAHVRLYAVCFVCPQGIRGQDFNLLFVDEANFIRPDAVQTIMGFLNQANCK 
II 

FVSSTNTGKASTSFLYNLRGAADELLJTVVTYICDDHMPRVVTHTNATACSCYILNKPVF I TMDGAVRR 
TA 

40 DLFLPDSFMQEI IGGQARETGDDRPVLTKSAGERFLLYRPSTTTNSGLMAPELYVYVDPAFTANTRAS 
GT 

GIAWGRYRDDFIIFALEHFFLRALTGSAPADIARCWHSLAQVLALHPGAFRSVRVAVEGNSSQDSA 
VA 

IATHVHTEMHRILASAGANGPGPELLFYHCEPPGGAVLYPFFLLNKQKTPAFEYFIKKFNSGGVMASQ 
EL
VSVTVRLQTDFVEYLSEQLNNLIETVSPNTDVR

IT 

RTS* 


Gene matched: gi | 139646 | sp | P04295 |VTER_HSV11 
Gene name: PROBABLE DNA PACKAGING PROTEIN 

[SEQ ID NO: 234] 
ORF # = 19 from Contig 15
ORF start site = 27251 
ORF end site = 26295 
ORF sequence: 

MITDCFEADIAI PSGI SRPDAAALQRCEGRWFLPTIRRQLALADVAHESFVSGGVSPDTLGLLLiAYR 
RR

FPAVITRVL PTRI VAC PVDLGLTHAGTVNLRNTS PVDLCNGDPVSLVP PVFEGQATDVRLESLDLTLR 
FP 

VPLPTPI^EIVARLVARGIRDLNPDPRTPGELPDLNVLYYNGARLSLVADVQQLASVNTELRSLVLN 
MV 

20 YS ITEGTTLI LTLI PRLLALSAQDGYVNALLQMQSVTREAAQLIHPEAPMLMQDGERRLPLYEALVAW 
LA 

HAGQLGDI LALA PAVRVCTFDGAAWQSGDMAPVI RYP * 

Gene matched: gi | 139191 | sp| P10202 |VP23_HSV11 
Gene name: CAPSID PROTEIN VP23



[SEQ ID NO:235] 
ORF # = 21 from Contig 15 
ORF start site = 32735
ORF end site = 32067 
ORF sequence: 

MTMRDDVPLLDRELVYEAACGGEDGELPLDEQFSLSSYGTSDFFVSSAYSRLPPHTQPVFSKRWMFA 
WS 

35 FLVLKPLELVAAGMYYGWTGRAVAPACIIAAVLAYYVTWLA^ 
MG 

GAALCALVAAAHETFSPDGLFHWITASQLLPRTDPLRARSLGIACAAGAAMWVAAADCFAAFTNFFLA 
RF 

WTRA I LKAPVAF * 


Gene matched: gi | 136841 | sp | P10204 | UL20_HSV11
Gene name: MEMBRANE PROTEIN UL20 

[SEQ ID NO: 236]
ORF # = 23 from Contig 15 
ORF start site = 37721 
ORF end site = 35205 
ORF sequence:

MGPGLWVVMGVLVGVAGGHDTYWTEQ I DPWFLHGLGLARTYV^DTNTGRLWLPNTPDASDPQRGRLAP 
PG 

ELNLTTASVPMLRWYAERFCFVLVTTAEFPRDPGQLLYIPKTYLLGRPRNASLPELPEAGPTSRPPAE 
VT 

10 QLKGLSHNPGASALLRSRAWVTFAAAPDREGLTFPRGDDGATERHPDGRRNAPPPG P PAGTPRHPTTN 
LS 

I AHLHNASVTWLAARGLLRTPGRYVYLS PS ASTWPVGVVn?TGGLAFGCDAALVRARYGKGFMGLVI SM 
RD 

S PPAEI I WPADKTLARVGNPTDENAPAVLPGP PAG PRYRVFVLGAPT PADNGSALDALRRVAGYPEE 
ST

NYAQYMSRAYAEFLGEDPGSGTDARPSLFWRLAGLI^SSGFAFVNAAHAHDAIRLSDLLGFLAHSRVL 
AG 

LAARGAAGCAADSWLNVSVLDPAARLRLEARLGHLVAAILEREQSLAAHALGYQLAFVLD^ 

VA 

20 PSAARLIDALYAEFLGGRALTAPMVRRALFYATAVLRAPFLAGAPSAEQRERARRGLLITTALCTSDV 
AA 

ATHADLRAALARTDHQKNLFWLPDHFSPCAASLRFDLAEGGFILDAIJVMATRSDIPADVMAQQT 
SV 

LTRWAHYNALIRAFVPEATHQCSGPSHNAEPRILVPITHNASYWTHTPLPRGIGYKLTGVDVRRPLF 

it

YLTATCEGHAREIEPKRLVRTENRRDLGLVGAVFLRYTPAGEVMSVLLVDTDATQQQLAQGPVAGTPN 
VF 

SSDVPSVALLLFPNGWIHLLAFDTLPIATIAPGFLAASALGVVMITAALAGILRVTOTCVPFLWRRE 
* 


Gene matched: gi | 138315 | sp | P06477 | VGLH_HSV11
Gene name: GLYCOPROTEIN H PRECURSOR 



[SEQ ID NO: 237]

ORF # = 24 from Contig 15 
ORF start site = 39188 
ORF end site = 38058 
ORF sequence: 

40 MASHAGQQHAPAFGQAARASGPTDGRAASRPSHRQGASEARGDPELPTLLRVYIDGPHGVGKTTTSAQ 
LM 

EALGPRDNIVYVPEPMTYWQVLGASETLTNIYNTQHRLDRGEISAGEAAWMTSAQITMSTPYAATDA 
VL 

APHIGGEAVGPQAPPPALTLVFDRHPIASLLCYPAARYLMGSMTPQAVLAFVALMPPTAPGTNLVLGV 
LP






ELAEHADRLARRQRPGERLDLAMLS 
AG 

S LPRI EDTLFALFRVPELLAPNGDLYHI FAWHjDVLADRLLPMHLFVLDYDQS PVGCRDALLRLTAGM 
IP 

TRVTTAGS I AE I RDLARTFAREVGGV *



Gene matched: gi | 125438 | sp|P04407 |KITH_HSV23 
Gene name: THYMIDINE KINASE


[SEQ ID NO:238]
ORF # = 25 from Contig 15 
ORF start site = 39090 
ORF end site = 39935
ORF sequence: 

MARTGRRAAVGRPARTSSLTERRRVLLAGVRSHTRFYKAFAREVREFNATRI CGTLLTLMSGSLQGR S 
LF 

EATRVTLICEVDLGPRRPDCICVFEFANDKTLGGVCVILELKTCKSISSGDTASKREQRTTGMKQLRH 
SL

KLLQSLAPPGDKWYLCPILVFVAQRTLRVSRVTRLVPQKISGNITAAVRMLQSLSTYAVPPEPQTRR 
SR 

RRVAATARPQRPPSPTRDPEGTAGHPAPPESDPPSPGWGVAAEGGGVLQKIAALFCVPVAAKSRPRT 
KT
E*

Gene matched: gi | 136854 | sp | P10208 | UL24_HSV11
Gene name: PROTEIN UL24 


[SEQ ID NO:239]
ORF # = 26 from Contig 15 
ORF start site = 40216 
ORF end site = 41973 
ORF sequence:

MDPYYPFDALDVWEHRRFIVADSRSFITPEFPRDFWMLPVFNIPRETAAERAAVLQAQRTAAAAALEN 
AA 

LQAAELPVDI ERRIRPI EQQVHH I ADALEALETAAAAAEEADAARDAEARGEGAADGAAPS PTAGPAA 
AE 

40 MEVQIVRNDPPLRYDTNLPVDLLHMVYAGRGAAGSSG^ 
MS 

KTFMTALVLSLQSCGRLWGQRHYSAFECAVLCLYLLYRTTHESSPDRDRAPVAFGDLIJU^PRYI^ 
LA 

AVIGDESGRPQYRYRDDKIiPKAQFAAAGGRYEHGALATHWIATLVRHGVLPAAPGDVPRDTSTRVNP 
DD
VAHRDDVNRAAAAFLARGHNLFLWEDQTLLRAT ANT I T ALAVLRRLLANGNVY ADRLDNRLQLGML I P
GA 

VPAEAIARGASGLDSGAIKSGDNNLEAIjCVNYVLPLYQADPTVELTO 
RR 

5 WDMSSGARQAALVRLTALELINRTRTNTTPVGEI INAHDALGIQYEQGIX3LLAQQARIGLASNAKRF 
AT 

FNVGSDYDLLYFLCLGFI PQYLSVA* 

Gene matched: gi | 136863 | sp | P10209 | UL25_HSV11
Gene name: VIRION PROTEIN UL25 

[SEQ ID NO: 240] 
ORF # = 27 from Contig 15 
ORF start site = 42206
ORF end site = 44179 
ORF sequence: 

MASAEMRERLEAPLPDRAVPIYVAGFLALYDSGDPGELALDPDTVR 
RV 

20 LAWNDPRGPFFVGLIACVQLERVLETAASAAIFERRGPALSREERLLYLITNYLPSVSLSTKRRGDE 
VP 

PDRTLFAHVALCAIGRRIXSTIWYDTSLDAAIAPFRHLDPATO^ 
LT 

HTLLSTAVNNMMLRDRWSLVAERRRQAGIAGHTYLQASEKFKIWGAESAPAPERGYKTGAPGAMDTSP 
AA

SVPAPQVAVRARQVAS SS SS S SSFPAPADMNPVSASGAPAPPP PGDG SYLWI PAFH YNQLVTGQSAPH 
HP 

PLTACGLPAAGTVAYGH PGAGPS PH YP PP PAHP YPGMLFAGPS PLEAQI AALVGAIAADRQAGGLPAA 
AG 

30 DHGIRGSAKRRRHEVEQPEYDCGRDEPDRDFPYYPGEARPEPRPVDSRRAARQASGPHETITALVGAV 
TS 

LQQELAHMRARTHAPYGPYPPVGPYHHPHADTETPAQPPRYPAEAVYLPPPHIAPPGPPLSGAVPPPS 
YP 

PVAVTPGPAPPLHQPSPAHAHPPPPPPGPTPPPAASLPQPEAPGAEAGALVNASSAAHVKRGHGPGRR 

sv

CVTDDGVPLTRLQDPDLGGVCVFI YFK * 

Gene matched: gi | 139233 | sp | P10210 | VP40_HSV11
Gene name: CAPSID PROTEIN P40 (VIRION S 


[SEQ ID NO: 241] 
ORF # = 28 from Contig 15 
ORF start site = 47298 
ORF end site = 44584
ORF sequence:

MRGGGLICALWGALVAAVASAAPAAPAAPR^GGVAATVAANGGPASRPPPVPSPATTKARKRKTKK 
PP 

KRPEATPPPDANATOAAGHATLRAHLREIKVENADAQFYVCPPPTGATWQFEQPRRCPTRPEGQNYT 
EG

IAWFKENIAPYKFKATMYYKDVTVSQVWFGHRYSQFMGIFEDRAPVPFEEVIDKINAKGVCRSTAKY 
VR 

NNMETTAFHRDDHETDMELKPAKVATRTSRGWHTTDLKYNPSRVEAFHRYGTTVNCIVEEVDARSW 
YD 

10 EFVIATGDFVYMSPFYGYREGSHTEHTSYAADRFKQVDGFYARDL^ 
WD 

WVPKRPAVCTMTKWQEVDEMLRAEYGGSFRFSSDAI STTFTTNLTQYSLSRVDLGDC IGRDAREAIDR 
MF 

ARKYNATHIKVGQPQYYLATGGFLIAYQPLLSNTLAELYVREYMREQDRKPRNATPAPLREAPSANAS 
VE

RI KTTS S I EFARLQFTYNHIQRHVNDMLGR I AVAWCELQNHELTLWNEARKLNPNAI AS ATVGRRVSA 
RM 

LGDVMAVSTCVPVAPDNVIVQNSMRVSSRPGTCYSRPLVSFRYEDQGPLIEGQLGENNELRLTRDALE 
PC 

20 TVGHRRYFIFGGGYVYFEEYAYSHQLSRADVTTVSTFIDLNITMLEDHEFVPLEVYTRHEIKDSGLLD 
YT 

EVQRRNQLHDLRFADIDTVIRADANAAMFAGLCAFFEGMGDLGRAVGKVVMGWGGWSAV 
SN 

PFGALAVGLLVIAGLVAAFFAFRYVLQLQRNPMKALYPLTTKELKTSDPGGVGGEGEEGAEGGGFDEA 
KL

AEAREMI RYMALVSAMERTEHKARKKGTS ALLS SKVTNMVLRKRNKARYS PLHNEDEAGDEDEL * 



Gene matched: gi | 138198 | sp| P06763 |VGLB_HSV23 
Gene name: GLYCOPROTEIN B PRECURSOR- gi |



[SEQ ID NO:242] 
ORF # = 29 from Contig 15 
ORF start site = 47122
ORF end site = 47338 
ORF sequence: 

WAGIX5TGGGREAG PPFAATVAATPPEARGAAGAAGAADATAATSAPTTSAQ I KPPPRMAGLRGRVAP 
AA 

R*



Gene matched: gi | 729379 | sp | P39055 | DYN1_CAEEL
Gene name: DYNAMIN»gi | 456286 (L29031) d 

[SEQ ID NO: 243]
ORF # = 30 from Contig 15 
ORF start site = 49662 
ORF end site = 47305
ORF sequence: 

MAAAPPAAVSEPTAARQKLIALIX3QVQTWFQLELLRRCDPQIGLGKLAQLKLNALQVRVLRRHLRPG 
LE 

AQAAAFLTPLSVTLELLLEYAWREGERLLGHLETFATTGDVSAFFTETMGLARPCPYHQQIRLETYGG 
DV

RMELCFLHDVENFLKQLNYCHLITPPSGATAALERVREFMVAAVGSGLIVPPELSDPSHPCAVCFEEL 
CV 

TANQGATIARRLADRICNHVTQQAQVRLDANELRRYLPHAAGLSDAARARALCVLDQALARTM 
RA 

1 5 GPPPADSSSVREEADALLEAHDVFQATTPGLYAISELRFWLASGDRARHSTMDAFADNLNALAQRELQ 
QE 

TAAVAVELALFGRRAEHFDRAFGGHLAALDMVDALI IGGQATSPDDQI EALIRACYDHHLTTPLLRRL 
VS 

PEQCDEEALRRVLARLGAGGATGGAEEEEPRAAAEEGGRRRGAGTPASEDGERGPEPGAQGPESWGDI 
AT

RAAADVPERRRLYADRLTKRS LAS LGRCVREQRGELEKMLRVSVHGEVL PATFAAVANGFAARAR FCA 
LT 

AGAGTVIDNRAAPGVFDAHRFMRASLLRHQVDPALLPSITHRFFELVNGPLFDHSTHSFAQPPNTALY 
YS 

25 VENVGLLPHLKEELARFIMGAGGSGADWAVSEFQKFYCFDGVSGITPTQRAAWRYIRELIIATTLFAS 
VY 

RCGELELRRPIX:SRPTSEGLYRYPPGVYLTYNSDCPLVAIVESGPDGCIGPRSVVVYDRDVFSILYSV 
LQ 

HLAPRLAGGGSDAPP* 


Gene matched: gi | 124088 | sp | P10212 | PRTP_HSV11
Gene name: PROCESSING AND TRANSPORT PRO 


[SEQ ID NO: 244] 
ORF # = 31 from Contig 15 
ORF start site = 51666
ORF end site = 50035 
ORF sequence:

MSLSLDPYTCGPCPLLQLLARRSNIAVYQDLALSQCHGWAGQSVEGRNFRNQFQPVLRRRVMDLFNN 
GF 

LSAKTLTVALSEGAA ICAPSLTAGQTAPAES SFEGDVARVTLGFPKELRVKSRVLFAGAS ANASEAAK 
AR 
VASLQSAYQKPDKRVDILLX3PLGFLLKQFHAVIFPNGKPPGSNQPNPQWFWTALQRNQLPARLLSRED
IE 

TIAFIKRFSLDYGAINFINLAPNNVSELAMYYMANQIIjRYCDHSTYFINTLTAVIAGSRRPPGVQAAA 
AW 

5 APQGGAGLEAGARALMDSLDAHPGAWTSMFASCNLLRPVMAARPMVVLGLS I SKYYGMAGNDRVFQAG 
NW 

ASLLGGKNACPLL I FDRTRKFVLACPRAGFVCAAS SLGGGAHEHS LCEQLRGI I AEGGAAVAS SVFVA 
TV 

KSLGPRTQQLQIEDWLALLEDEYLSEEMMEFTTRALERGHGEWSTDAALEVAHEAEALVSQLGAAGEV 
FN

FGDFGDEDDHAASFGGLAAAAGAAGVARKRAFHGDDPFGEGPPEKKDLTLDML* 

Gene matched: gi | 544182 | sp | P36384 | DNBI_HSV2
Gene name: MAJOR DNA-BINDING PROTEIN (IN



[SEQ ID NO:245] 
ORF # = 32 from Contig 15 
ORF start site = 53575
ORF end site = 51701 
ORF sequence: 

MDTKPKTTTTVKVPPGPMGYVYGRACPAEGLELLSLLSARSGDADVAVAPLIVGLTVESGFEANVAAV 
VG 

25 SRTTGLGGTAVSLKLMPSHYSPSVYVFHGGRHLAPSTQAPNLTRLCERARRHFGFSDYAPRPCDLKHE 
TT 

GDALCERLGLDPDRALLYLVITEGFREAVCI SNTFLHLGGMDKVTIGDAEVHRI PVYPLQMFMPDFSR 
VI 

adpfncnhrsigenfnyplpffnrplarllfeawgpaavalrar]wdavaraaahiaf 
pa

ditftafeasqgkpqrgardagnkgpaggfeqrlasvmagdaaij^esivsmawdepppdittwpll 

EG 

QETTPAARAGAVGAYLARAAGLVGAMVFSTNSALHLTEVDDAGPADPKDHSKPSFYRFFLVPGTHVAAN 
PQ 

35 LDREGHVVPGYEGRPTAPLVGGTQEFAGEHLAMIXrGFSPALLAKMLFYLERCIXjGVI 
VA 

DSGOTDVPCNLCTFETRHACAHTTLMRLRARHPKFASAARGAIGVFG 
KR 

A1X3SENTRTIMQETYRAATERVMAELEALQYVDQAVPTALGRLETIIGTREALHTVVNNIKQLV* 


Gene matched: gi | 544182 | sp | P36384 | DNBI_HSV2
Gene name: MAJOR DNA-BINDING PROTEIN (IN 

[SEQ ID NO:246]
ORF # = 33 from Contig 15 
ORF start site = 54393 
ORF end site = 58115
ORF sequence:

MFCAAGGPTSPGGKSAARAASGFFAPHNPRGATQTAPPPCRRQNFYNPHLAQTGTQPKAPGPAQRHTY 
YS 

ECDEFRFIAPRSLDEDAPAEQRTGVHDGRLRRAPKVYCGGDERDVLRVGPEGFWPRRLRLWGGADHAP 
EG 

10 FDPTVTVFHVYDILEHVEHAYSMRAAQLHERFMDAITPA 
NK 

AEVDRHLQCRAPRDLCERLAAALRES PGAS FRG I SADHFEAEVVERADVYYYETRPTLYYRVFVRSGR 
AL 

AYLCDNFC PAIRKYEGGVDATTRFI LDNPGFVTFGWYRLKPGRGNAPAQPRP PTAFGTS SDVEFNCTA 
DN

LAVEGAMCDLPAYKLMCFDIECKAGGEDELAFPVAERPEDLVIQISCLLYDLSTTALEHILLFSLGSC 
DL 

PESHLSDI^SRGLPAPVVLEFDSEFEMLLJU^MTFVKQYGPEFVTGYNIINFDWPFVLTKLTEIYKVP^ 
DG 

20 YGRMNGRGVFRVWDIGQSHFQKRSKI KVNGMVNI DMYG 1 1 TDKVKLSS YKLNAVAEAVLKDKKKDLS Y 
RD 

IPAYYASGPAQRGVIGEYCVQDSLLVGQLFFKFLPHLELSAVARLAGINITRTIYDGQQIRVFTCLLR 
LA 

GQKGFILPDTQGRFRGLDKEAPKRPAVPRGEGERPGDGNGDEDKDDDEDGDEDGDEREEVARETGGRH 
VG

YQGARVLDPTSGFHVDPVWFDFASLYPSIIQAHNLCFSTLSLRPEAVAHLEADRDYLEIEVGGRRLF 
FV 

KAHVRESLLSILLRDWIJVMRKQIRSRIPQSTPEEAVLLDKQQAAIKWCNSVYGFTGVQHGLLPCLHV 
AA 

30 TVTTIGREMLIATRAYVHARWAEFDQLLADFPEAAGMRAPGPYSMRI I YGDTDS I FVLCRGLTAAGLV 
AM 

GDKMASHISRAIJ^LPPIKLECEKTFTKLLLIAKKKYIGVICGGKMLIKGVDLVRKNNCAFINRT^ 
VD 

LLFYDDWSGAAAALAERPAEEWLARPLPEGLQAFGAVLVDAHRRITDP 
YT

NKRLAHLTVYYKLMARRAQVPSIKDRIPWIV^ 
PA 

KRPRETPSHADPPGGASKPRKLLVSEIAEDPGYAIARGVPIOTDYYFSHLLGAACW 
TE 

40 SLLKRFI PETWHP PDDVAARLRAAGFG PAGAGATAEETRRMLHRAFDTLA* 

Gene matched: gi | 118882 | sp | P07918 | DPOL_HSV21 
Gene name: DNA POLYMERASE 

[SEQ ID NO:247]
ORF # = 34 from Contig 15 
ORF start site = 58977 
ORF end site = 58060
ORF sequence: 

MYDIAPRRSGSRPGPGRDKTRRRSRFSAAGNPGVERRASRKSLPSHARRLEIXTLHERRRYRGFFAALA 
QT 

PSEEIAIVRSLSVPLVKTTPVSLPFSLIX3TVADNCLTLSGMGYYLGIGGCCPACSAGDGRLATVSREA 
LI

LAFVQQINTIFEHRTFLASLVVLADRHSTPLQDLLADTLGQPELFFW 
YG 

GHMLYVI FPGTSAHLHYRLIDRMLTACPGYRFAAHWQSTFVLVVRRNAEKPADAEI PTVSAADI YCK 
MR 

DISFDGGLMLEYQRLYATFDEFPPP*



Gene matched: gi | 136875 | sp | P10215 | UL31_HSV11 
Gene name: PROTEIN UL31 


[SEQ ID NO:248] 
ORF # = 35 from Contig 15 
ORF start site = 60760 
ORF end site = 58970 
ORF sequence:

MATSAPGVPSSAAVREESPGSSWKEGAFERPYVAFDPDLLALNEALCAELLAA 
VE 

SDVAPAPPRPRGAAREASGGRGPGSARGPPADPTAEGLLDTGPFAAASVDTFALDRPCLVCRTIELYK 
QA 

30 YRLS PQWADYAFLCAKCLGAPHCAAS IFVAAFEFVYVMDHHFLRTKKATLVGSFARFALTINDIHRH 
FF 

LHCCFRTDGGVPGRHAQKQPRPTPSPGAAKVQYSNYSFLAQSATRALIGTLASGGDDGAGAGGGSGTQ 
PS 

LTTALMNWKDCARLLDCTEGKRGGGDSCCTRAAARNGEFEAAAGAIAQGGEPETWAYADL I LLLLAGT 

pa

vwesgprlraaadarraavsesweahrgarmrdaaprfaqfaepkaqpdldi^pi^tvlkhgrgrgr 

TG 

GECLLCNLLLVRAYWIAMRRLRASVWYSENNTSLFIXriVPVVDQLEADPE^ 
PE 

40 AIFKHMFCDPMCAITEMEVDPWVLFGHPRADHRDELQLHKAKLACGNEFEGRVCIALRALIYTFKTYQ 
VF 

VPKPTALATFVREAGALLRRHSISLLSLEHTLCTYV* 

Gene matched: gi | 136879 | sp| P10216 |UL32_HSV11 
Gene name: PROBABLE MAJOR ENVELOPE GLYC
[SEQ ID NO: 249]
ORF # = 36 from Contig 15 
ORF start site = 60759
ORF end site = 61151 
ORF sequence: 

MAGRAGRTRPRTLRDAI PDCALRSQTLESLDARYVSRDGAGDAAVWFEDMT PAELEVIFPTTDAKLNY 
LS 

1 0 RTQRLASLLTYAGPI KAPDGPAAPHTQDTACVHGELLARKRERFAAVINRFLDLHQI LRG * 



Gene matched: gi | 136883 | sp|P10217 |UL33_HSV11 
Gene name: UL33 


[SEQ ID NO: 250] 
ORF # = 37 from Contig 15 
ORF start site = 61241 
ORF end site = 62071
ORF sequence: 

MAGMGKPYGGRPGDAFEGLVQRIRLIVPTTLRGGGGESGPYSPSNPPSRCAFQFHGQDGSDEAFPIEY 
VL 

RLMNDWADVPCNPYLRVQNTGVSVLFQGFFNRPHGAPGGAI TAEQTNVI LH STETTGLSLGDLDDVKG 
RL

GLDARPMMASMWI SCFVRMPRVQLAFRFMGPEDAVRTRRI LCRAAEQALARRRRSRRSQDDYGAVAVA 
AA 

HHSSGAPGPGVAASGPPAPPGRGPARPWHQAVQLFRAPRPGPPALLLLVAGLFLGAAIWWAVGARL* 


Gene matched: gi | 136888 | sp| P10218 |UL34_HSV11 
Gene name: VIRION PROTEIN UL34 



[SEQ ID NO:251]

ORF # = 38 from Contig 15 

ORF start site = 62183 

ORF end site = 62521 

ORF sequence: 
MAAPQFHRPSTITADNVRALGMRGLVLATNNAQFIMDNSYP

AN 

NTFAPQPMFAGDAAAEWLRPSFGLKRTYSPFWRDPKTPSTP* 



Gene matched: gi | 139196 | sp | P10219 | VP26_HSV11
Gene name: CAPSID PROTEIN VP26



[SEQ ID NO:252]
ORF # = 39 from Contig 15
ORF start site = 72047 
ORF end site = 62688 
ORF sequence: 

MIPAALPHPTMKRQGDRDIWTGVRNQFATDLEPGGSVSCMRSSLSFLSLLFDVGPRDVLSAEAIEGC 
LV

EGGEWTRAAAGSGPPRMCSIIELPNFLEYPAARGGLRCVFSRVYGEVGFFGEPTAGLLETQCPAHTFF 
AG 

PWAMRPLSYTLLTIGPLGMGLYRDGDTAYLFDPHGLPAG 
AG 

15 AMVFFVPSGPGAVAPADLTAAALHLYGASETYLQDEPFVERRVAITHPLRGEIGGLGALFVGWPRGD 
GE 

GSG PWPAL PAPTHVQT PRADR PPEAPRGASGP PNTPQAGHPNRPPDDVWAAALEGT P PAX PSAPDAA 
AS 

GPPHAAPPPQTPAGDAAEEAEDLRVLEVGAVPVGRHRARYSTGLPKRRRPTWTPPSSVEDLTSGERPA 
PK

APPAKAKKKSAPKKKAPVAAEVPASSPTPIAATVPPAPDTPPQSGQGGGDDGPASPSSPSVLETLGAR 
RP 

PEPPGADIAQLFEVHPNVAATAVRLAARDAALAREVAACSQLTINALRSPYPAHPGLLELCVIFFFER 
VL 

25 AFL I ENGARTHTQAGVAGPAAALLDFTLRMLPRKTAVGDFLASTRMSLADVAAHRPL I QHVLDENSQI 
GR 

LALAKLVLVARDVI RETDAFYGDL ADLDLQLRAAPPANLYARLGEWLLERSRAHPNTLFAPAT PTH PE 
PL 

LHRIQAIAQFARGEEMRVEAEAREMREALDALARGVDSVSQR^ 
PE

AIQARLEDVRIQARRAIESAIKEYFHRGAVYSAKALQASDSHDCRFHVASAAWPMVQLLESLPAFDQ 
HT 

RDVAQRAALPPPPPLATSPQAILLRDLLQRGQTLDAPEDLAAWLSVLTDAATQGLIERKPLEEIiARSI 
HG 

35 INDQQARRSSGLAELQRFDALDAALAQQLDSDAAFVPATGPAPYVDGGGLSPEATRMAEDALRQARAM 
EA 

AKMTAEIAPEARSRLRERAHALEAMLNDARERAKVAHDAREKFLHKLQGV^ 
TL 

RASLPAGV^DIADAVRGPPPEVTAALRADLWGLLGQYREALEHPTPOTATALAGLHPA 
DA

PETPVLVQFFSDHAPTI AKAVSNAINAGS AAVATAS PAATVDAAVRAHGALADAVS AliGAAARDPAS P 
LS 

FLAAI^SAAGYVKATRLALE^GAIDELTTLGSAAADLWQARRACAQPEG 
RE 
SIAGHEAGFGGLLHAEGTAGDHSPSGRALQEIXSKVIGATRRRADELE^VADLTAKMAAQRARGSS^
WA 

AGVEAALDRVENRAEFDWELRRLQAIAGra 
EN 

5 QRHPMLPPLAAIHRLGWSAAFHAAAETYADMFRVDAEPIA^^ 
LA 

DDMTSVPGLRRYVPFFQHGYADYVELRDRLDAIRADVHRALX^VPLDLAAAAEQISAARNDPEATAEL 
VR 

TGVTLPCPSEDALVACAAALERVDQSPVKNTAYAEWAFVTO^ 
GL

REAIJUVRERRAQIEAEGLANLKTMLKWAVPAWAKTLDQ 
VI 

WLEHAQRTFETHPLSAARGIXSPGPLARHAGRI^ALFDTRRRVDALRR 
GA 

1 5 WKSPEGFRAMHEQLRALQDTTNWSGLRAQPAYERLSARYQGVLGAKGAERAEAVEELGARVTKHTAL 
CA 

RI^DEVVRRVPWEMNFDALGRLLAEFDAAAADI^PWAVEEFRGARELIQRRMGLYSAYARAGGQTGAG 
AA 

AAPAPLLVDLRALDARARASSSPEGHEVDPQLLRRRGEAYLRAGGDPGPLVLREAVSALDLPFATSFL 

ap

DGTPLQYALCFPAVTDKLGALLMRPEAACWPPLPTDVLESAPTVTAMY\^ 
FQ 

LFGRFVRHRQATWGASMDAAAELYVALVATTLTREFGCRWAQLGWASGAAAPRPPPGPRG S QRHCVAF 
NE 

25 NDVLVALVAGVPEHIYNFWRLDLVRQHET^HLTLERAFEDAAESMLFVQRLTPHPDARIRVLPTFLDG 
GP 

PTRGLLFGTRLADWRRGKLSETDPLAPWRSALELGTQRRDAPAI^KLSPAQAIJ^VSVLGRMCLPS^ 
LA 

'ALWTCMFPDDYTEYDSFDALLAARLESGQTLGPAGGREASLPEAPHALYRPTGQHVAVTJWVTHRTPA 
AR

VTA^LVLAAVLLGAPVWALRIHTAFSR^ 
TD 

LNPI ENACLAAQLPRLSALIAERPLADGPPCLVLVDI SMTPVAVLWEAPEPPGPPDVRFVGSEATEEL 
PF 

35 VATAGDVLAASAADADPFFARAILGRPFDA^LTGELFPGHPVYQRPLADEAGPSAPTAARDPRDIA 
GD 

GGSGPEDPAAPPARQADPGVIAPTLLTDATTGEPVPPRMWAWIHGLEELASDDAGGPTPNPAPALLPP 
PA 

TDQSVPTSQYAPRPIGPAATARETRPSVPPQQNTGRVPVAPRDDPRPSPPTPSPPADAALPPPAFSGS 
AA

AFSAAVPRVRRSRRTRAKSRAPRASAPPEGWRPPALPAPVAPVAASARPPDQPPTPESAPPAWVSALP 
LP 

PGPAS ARGAFPAPTLAP I PPPPAEGAVAPGDDRRRGRRQTTAG PS PTPPRGPAAGPPRRLTRPAVASL 
SA 
SLNSLPSPRDPADHAAAVSAAAAAVPPSPGLAPPTSAVQTSPPPLAPG.PVAPSEPLCGWWPGGPVAR
RP 

PPQSPATKPAARTRIRARSVPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQP 
PL 

5 PQPPLPQPPLPQSRDSVPTPESPTHTOTHLPVSAVTSWASSLALHVDSAPPPASLLQTLHISSDDEHS 
DA 

DSLRFSDSDDTEALDPLPPEPHLPPADEPPGPLAADHLQSPHSQFGPLPVQANAVLSRRYVRSTGRSA 
LA 

VLIRACRRIQQQLQRTRRALFQRSNAVLTSLHHVRMLLG* 


Gene matched: gi | 135576 | sp | P10220 |TEGU_HSV11 
Gene name: LARGE TEGUMENT PROTEIN (VIRI 



[SEQ ID NO:253]

ORF # = 40 from Contig 15 
ORF start site = 75699 
ORF end site = 72355 
ORF sequence: 

20 MSDSALQVPAPAGMTPPSAPPPNGPLQVLLGSLTNLRRPPSPSSEPAGSADEPAFLSAAKLRAATAAF 
LL 

SGAAVGPAEARACWHPLLEQLCALHRAHGLPETALLAENLPGLLVHRMAVALPETPEAAFREMDVIKD 
TV 

laitgsdtthaleaaglrttaalx3ptovrqcave 
pl

GQPGANLTTPAYSLLFPSPIVQEGLRFLALVSNWVTLFSAHLQRIDDAALTPLTRALFTLALVDEYLT 
TP 

DRGAWPPPLLAQFQHTVREIDPAIMIPPLEATKMVRSREEVRVSTALSRVSPRSACAPPGTLMARVR 
TD 

30 AAVFDPDVPFL SAS ALAI FRPAVTGLLQLGE P PSAGAQQRLLALLQQTWALVQNSNS PS WINTLTDA 
GF 

TPAHCTQ YI SALEGFLVAGVPARTP PGHGLS EIQQLFGC I ALAGANVFGLAREYGHYAGYVKTFRRIQ 
GA 

SEHTHGRLCEAVGLSGGVLSQTLARIMGPAVPTEHLASLRRTLVGEFETAERRFSAGQPSLLRETALI 
WL

DVYGQTHWDLTPTTPATPLSALLPVGPPSHAPSVHLAAATKIRFPALEGIHPNVIADPGFVPY 
VG 

DALRATCNAAYLPRPI EFALRVLAWARDFGLGYLPTVEGHRTKLGALITLLEPATRAGVG PTMQMADN 
IE 

40 QLLRELYVIARGAVEQLRPAVQLPPPQPPEVGSSLLLISMYALAARGVLQELAERADPLVRQLEDAIV 
LL 

RLHMRTLAAFFECRFESIX3HRLYAWADAHERIX3PWRPEAMG 
W 

TETTAHLGVCDELAAQVSHEGNVLAVVRREIHGFLAIVSGIH^ 
RL
SAG YQAARAATG PERVAEFVQELHDTWKGLQTERALWAPFAS SADQRTAAI QBVMAHATEDAPPS PA
AD 

LVVLTNRHDIX3AWGDYSLGPLGQPTWPDSVDLSPQGLAATLSMDWLLINELLQVTTC 
AG 

5 PEAPGDLEAQDAGGSTPEPTTPGPQDTQARAPSTRPAGRETVPWPNTPVEDDEMTPQETPPVHP* 



Gene matched: gi | 136894 | sp| P10221 |V120_HSV11 
Gene name: CAPSID ASSEMBLY PROTEIN UL37 


[SEQ ID NO:254] 
ORF # = 42 from Contig 15 
ORF start site = 78158 
ORF end site = 81592
ORF sequence: 

MANRPAASALAGARSPSERQEPREPE^APPGGDHWCRKVSGVMVLSSDPPGPAAYRISDSSFVQCGS 
NC 

smi i dgdvarghlrdlegatstgaf vai snvaaggdgrtawalggtsgpsattsvgtqtsgeflhgn 
pr

TPEPQGPQAVPPPPPPPFPWGHECCARRDARGGAEKDVGAAESWSDGPSSDSETEDSDSSDEDTGSGS 
ET

LSRSSSIWAAGATDDDDSDSDSRSDDSVQPDVWRRRWSDGPAPVAFPKPRRPGDSPGNPGLGAGTGP 
GS 

25 ATDPRASADSDSAAHAAAPQADVAPVLDSQPTVGTDPGYPVPLELTPENAEAVARFLGDAVDREPALM 
LE 

YFCRCAREESKRVPPRTFGSAPRLTEDDFGLLNYALAEMRRLCLDLPPVPPNAYTPYHLREYATRLVN 
GF 

KPLVRRSARLYRIIXSILVHLRIRTREASFEEWMRSKEVDLDFGLTERLREHEAQLMIIA^ 

ih

STPNTLVERGLQSALKYEEFYLKRFGGHYMESVFQMYTRIAGFIACRATRGMRHIALGRQGSWWEMFK 
FF 

FHRLYDHQIVPSTPAMLNLGTRNYYTSSCYLVNPQATTNQATLRAITGNVSAIUU^GGIGI^MQAFN 
DA 

35 S PGTAS IMPALKVLDSLVAAHNKQSTRPTGACVYLEPWHSDVRAVLRMKGVLAGEEAQRCDNI FSALW 
MP 

DLFFKRLIRHLDGEKNVTWSLFDRDTSMSLADFHGEEFE^^ 
TT 

gspfimfkdavnrhyiydtqgaaiagsnlcteivhpsskrssgvcnlgsvnlarcvsrrtfdfgmlrd 
av

QACVLMVNIMIDSTLQPTPQCARGHDNLRSMGIGMQGLHTACLKMGLDLESAEFRDLNTHIAEVMLLA 
AM 

KTSNALCVRGARPFSHFKRSMYRAGRFHWERFSNASPRYEGEWEMLRQSMMKHGLRNSQFIALMPTAA 
SA 
QISDVSEGFAPLFTNLFSKVTRIX3ETLRPNTLLLKELERTFGGKRLLDAMIX3LEAKQWSVAQALPCLD
PA 

HPLRRFKTAFDYDQELLIDLCADRAPYVDHSQSOTL 
YC 

KVRKATNSGVFAGDDNI VCTSCAL *

Gene matched: gi | 1710385 | sp| P09853 |RIR1_HSV23 
Gene name: RIBONUCLEOSIDE-DIPHOSPHATE


[SEQ ID NO: 255] 
ORF # = 43 from Contig 15 
ORF start site = 81665 
ORF end site = 82658 
ORF sequence:

MDPAVSPASTDPLDTHASGAGAAPIPVCPTPERYFYTSQCPDINHLRSLSILNRWLETELVFVGDEED 
VS 

KLSEGELGFYRFLFAFLSAADDLVTENLGGLSGLFEQKDILHYYVEQECIEWHSRVYNIIQLVLFHN 
ND 

20 QARRAYVART INHPAI RVKVDWLEARVRECDS I PEKF ILM I LI EGVFFAASFAAI AYLRTNNLLRVTC 
QS 

NDLISRDEAVHTTASCYIYNNYLGGHAKPEAARVYRLFREA^^ 
EN 

YVRFSADRLLGLIHMQPLYSAPAPDASFPLSLMSTDKHTNFFECRSTSYAGAWNDL* 


Gene matched: gi | 132624 | sp| P03174 |RIR2_HSV23 
Gene name: RIBONUCLEOSIDE-DIPHOSPHATE R 


[SEQ ID NO: 256] 
ORF # = 44 from Contig 15 
ORF start site = 84014 
ORF end site = 82941 
ORF sequence:

MRRRGPiAFAPGDRGTRAAGPGPAAPWGAPSKPALRLAHLFCIRVLRALGYAYINSGQLEADDACANLY 
HT 

NTVAYVHTTDTDLLLMGCDIVLDISTGYIPTIHCRDLLQYFTO 
ED 

40 VLRECHOTAPSRSQARRAARRERANSRSLESMPTLTAAPVGLETRISWTEILAQQIAGEDDYEEDPPL 
QP 

PDVAGGPRDGARSSSSEILTPPELVQVPNAQRVAEHRGYVAGRRRHVIHDAPEALDWLPDPMTIAELV 
EH 

RYVKYVISLISPKERGPWTLLKRLPIYQDLRDEDLARSIVTRHITAPDIADRFLAQLWAHAPPPAFYK 
DV
LAKFWDE*



Gene matched: gi | 549322 | sp | P36699 | VHS_HSV2G
Gene name: VIRION HOST SHUTOFF PROTEIN



[SEQ ID NO:257] 
ORF # = 45 from Contig 15 
ORF start site = 84914
ORF end site = 86326 
ORF sequence: 

MAHLPGGAAAAPLSEDAIPSPRERTEDWPPCQIVLQGAELNGILQAFAPLRTSLLDSLLWGDRGILV 
HN 

15 AIFGEQWLPLDHSQFSRYRWGGPTAAFLSLVDQKRSLLSVFTIANQYPDLRRVELTVTGQAPFRTLVQ 
RI 

WTTASDGEAVELAS ETLMKRELTS FAVLLPQGDPDVQLRLTKPQLTKWNAVGDETAK PTTFELGPNG 
KF 

SWNARTCVTFAAREEGASSSTSAQVQILTSALKKAGQAAANAKTVYGENTHRTFSVVVDDCSMRAVL 
RR

LQVGGGTLKFFLTADVPSVCVTATGPNAVSAVFLLKPQRVCLNWLGRTPGSSTGSLASQDSRAGPTDS 
QD 

FSSEPDAGDRGAPEEEGLEGQARVPPAFPEPPGTKRRHAGAEWPADDATKRPKTGVPAAPTRAESPP 
LS 

25 ARYGPEAAEGGGDGGRYACYFRDLQTGDAS PSPLSAFRGPQRPPYGFGLP * 



Gene matched: gi | 136905 | sp | P10226 |VPAP_HSV11 
Gene name: DNA POLYMERASE PROCESSIVITY


[SEQ ID NO:258]
ORF # = 48 from Contig 15 
ORF start site = 89794 
ORF end site = 90312
ORF sequence: 

MAFRASG PAYQPLAPAAS PARARVPAVAWIGVGAI VGAFALVAALVLVPPRS SWGLS PCDSGWQEFNA 
GC 

VAVTOPTPVEHEQAVGGCSAPATLIPRAAAKHLAALTRVQAERSSGYWWVNGDGIRTC 
EF

CEELAI RIC YYPRS PGGFVRFVTS I RNALGL P* 

Gene matched: gi| 136917 | sp| P06483 | UL45_HSV23 
Gene name: PROTEIN UL45 HOMOLOG (18 KD

[SEQ ID NO:259]
ORF # = 49 from Contig 15
ORF start site = 92744 
ORF end site = 90579
ORF sequence: 

MQRRARGASSLRLARCLTPANLIRGANAGVPERRIFAGCLLPTPEGLLSAAVGVLRQRADDLQPAFLT 
GA 

drsvriaarhhnwpeslivtclasdphydy 
vp

sgldipddpagixdpslhvllrptllpkllvrapfksgaaaakyaaavaglrdaahrlqqymffmrpa 

DP 

SRPSTDTALRLSEFLAWSVLYHWASWMLOTADKYVCRRLGPADRRFVALSGSLEAPAETFARHLDRO 
PS 

15 GTTGSMQCMAIJIAAVSDVLGHLTRLAHLWETGKRSGGTYG I VDAI VSTVEVLS IVHHHAQY I INATLT 
GY 

VWASDSI^EYLRAAVDSQERFCRTAAPLFPTMTAPSWARMELSIKSWFGAALAPDLLRSGTPSPHY 
ES 

ILRLAASGPPGGRGAVGGSCRDKIQRTRRDNAPPPLPRARPHSTPAAPRRFRRHREDLPEPPHVDAAD 
RG

PEPCAGRPATYYTHMAGAPPRLPPRNPAPPEQRPAAAARPLAAQREAAGVYDAVRTWGPDAEAEPDQM 
EN 

TYLLPDDDAAMPAGVGLGATPAADTTAAAWPAESHAPRAPSEDADS I YESVSEDGGRVYEEI PWVRVY 
EN 

25 ICLRRQDAGGAAPPGDAPDSPYIEAENPLYDWGGSALFSPPGATRAPDPGLSLSPMPARPRTNALAND 
GP 

TNVAALSALLTKLKRGRHQSH * 



Gene matched: gi | 114350 | sp | P10230 | ATI2_HSV11
Gene name: ALPHA TRANS-INDUCING FACTOR



[SEQ ID NO:260]
ORF # = 50 from Contig 15
ORF start site = 93910 
ORF end site = 92828 
ORF sequence: 

VGAAAVPLLSAGGAAPPHPGPDAAVFRSSLGSLLYWPGVRALI^ 
FN

PGAVKCVLPREAAFAGRVLDVIAVLAEQTVQWL^ 
W 

AAEHEALGDTAARRLIJVTSGLNAVLGAAVYALHTALATOT 
IL 
QRLI^LADTWACVAIAAFDGGST
RP 

ASSAPRAPVSGTADPAFLLEDLAAFPPAPLNSESVLGPRVRVVDIMAQFRKLLMGDEETAALRAHVSG 
RR 

ATGLGGPPRP *



Gene matched: gi | 136920 | sp | P10231 | UL47_HSV11
Gene name: VIRION PROTEIN UL47


[SEQ ID NO:261] 
ORF # = 51 from Contig 15
ORF start site = 94919 
ORF end site = 93504
ORF sequence: 

MSVRGHAVRRRRASTRSHAPSAHRADSPVEDEPEGGGVGLMGYLRAVFNVDDDSEVEAAGEMASEEPP 
PR 

RRREARGHPGSRRASEARAAAPPRRASFPRPRSVTARSQSVRGRRDSAITRAPRGGYLGPMDPRDVLG 
RV

GGSRWPSPLFLDELSYEEDDYPAAVAHDDGAGARPPATVEILAGRVSGPELQAAFPLDRLTPRVAAW 
DE 

SWSALALiGHPAGFYPCPDSAFGLSRVGVMHFASPADPKVFFRQTLQQGEALAWYVTGDAILDLTDRR 
AK 

25 TSPSRAMGFLVDAIVRVAINGWVCGTRLHTEGARLGARRQGGRAPTAVREPHGVAARGGRRRAAAQRG 
RG 

RAPP PRPRRRGLSQFAGVPAVLARGARAPGARL SRGRPLRGAHDVHRHRGS ARPLQ PRRRQMRAPAGG 
RV 

CGARPGRAGGPGGADGPVALGGRGGAPAPALRPPRVCGRGAGGAVSRPAPG* 


Gene matched: gi | 136920 | sp| P10231 |UL47_HSV11 
Gene name: VIRION PROTEIN UL47 


[SEQ ID NO: 262]
ORF # = 53 from Contig 15 
ORF start site = 98257 
ORF end site = 97349 
ORF sequence:

MTSRRSVKSCPREAPRGTHEELYYGPVSPADPESPRDDFRRGAGPMRARPRGEVRFLHYDEAGYALYR 
DS 

SSSEDNDESRDTARPRRSASVAGSHGPGPARAPPPPGGPVGAGGRSHAPPARTPKMTRGAPKAPATPA 
TD 
PARGRRPAQADSAVLLDAPAPTASGRTKTPAQG^
RL 

AATHARLAAVQLWDMSRPHTDEDI^ELIJDLTTI^ 
PA 

GRAAATARAPARSASRPRRPLE*



Gene matched: gi | 136927 | sp | P10233 | UL49_HSV11
Gene name: TEGUMENT PROTEIN UL49 


[SEQ ID NO: 263] 
ORF # = 54 from Contig 15
ORF start site = 98876 
ORF end site = 98596
ORF sequence: 

WLLFVALVAGVPGEPPNAAGARGVIGDAQCRGDSAGWSVPGVLVPFYLGMTSMGVCMIAHVYQICQ 
RA 

LAAGSA* 


Gene matched: gi | 1944541 | gnl | PID | e312365
Gene name: (X14112) envelope protein [human 


[SEQ ID NO: 264] 
ORF # = 55 from Contig 15 
ORF start site = 98867 
ORF end site = 99976 
ORF sequence:

MSQWGPRAILVQTDSTNRNADGDWQAAVAI RGGGWQLNMVNKRAVDFTPAECGDSEWAVGRVS LGLR 
MA 

MPRDFCAIIHAPAVSGPGPHVMLGLVT)SGYRGTVIjAvVVAP 
TE 

35 PSSLHRFPQI^PSPIJVGLREDPWLDGALATAGGAVALPARRRGGSLVYAGELTQVTTEHGDCVHEAPA 
FL 

PKREEDAGFDI LIHRAVTVPANGAWIQPSLRVIjRAADGPEACYvIjG 
CA 

FWCNLTGVP VTLQAGSKVAQLLVAGTHALPWI PPDNIHEDGAFRAYPRGVPDATATPRDPPI LVFTN 
40 EF 

DADAPPSKRGAGGFGSTGI * 

Gene matched: gi | 118955 | sp| P10234 |DUT_HSV11 
Gene name: DEOXYURIDINE 5'-TRIPHOSPHATE
[SEQ ID NO: 265]
ORF # = 56 from Contig 15 
ORF start site = 101006
ORF end site = 100182 
ORF sequence: 

MASLLGVLC<^GTRPEEQQYEMIRAAAPPSXXDPRLQ 
AR 

10 ALARTYHACMVNLERLARHH PGLEGSTI IX5AVAAHRDKMRRLADTCMATI LQMYMSVGAADKS ADVLV 
SQ 

AIRSMAESDVVMEDVAIAERALGLSTSALAGGTRTAGLGATEAPPGPTRAQAPEVASVPVTHAGDRSP 
VR 

PGPVPPADPTPDPRHRTSAPKRQASSTEAPLLLA* 


Gene matched: gi | 136933 | sp | P10235 | UL51_HSV1
Gene name: PROTEIN UL51 

[SEQ ID NO: 266]

ORF # = 58 from Contig 15 
ORF start site = 102815 
ORF end site = 104188 
ORF sequence:

MYVNRNEI FNAALAVTNI ILDLDI ALKEPVPFPRLHEALGHFRRGALAAVQLLFPAARVDPDAYPCYF
FK 

SACRPRAPPVCAGDGPSAGGDDGDGDWFPDAGGDDGDEEWEEDTDPMDTTHGPLPDDEAAYLDLLHEQ 
IP 

AATPSEPDSWCSCADKIGLRVCLPVPAPYWHGSLTMRGVARVIQQAVLLDRDFVEAVGSHVKNFLL 
ID

TGVYAHGHSLRL PYFAKIGPDGSACGRLLPVFVI PPACEDVPAFVAAHADPRRFHFHAPPMFSAAPRE 
IR 

VLHSLGGDYVSFFEKKASRNALEHFGRRETLTEVLGRYDVRPDAGETVEGFASELLGRIVACIEAHFP 
EH 

AREYQAVSVRRAVI KDDWVLLQLI PGRGALNQSLSCLRFKHGRASRATARTFLALS VGTNNRLCASLC
QQ 

CFATKCDNNRLHTLFTVDAGTPCSRSAPSSTSRPSSS * 



Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11
Gene name: DNA HELICASE / PRIMASE COMPLEX 



[SEQ ID NO: 267] 
ORF # = 59 from Contig 15
ORF start site = 104140
ORF end site = 105156 
ORF sequence: 

MLAVRS LQHLTTVI FITAYGLVLAWYI VFGAS PLHRC IYAVRPAGAHNDTALVWMKINQTLLFLGPPT 
AP

PGG AWT PHAHVC YAN 1 1 EGRAVSLPAI PGAMSRRVMNVHEAWCLEALWDTQMRLVVVGWFLYLAFVA 
LH 

QRRCMFGWSPAHSWAPATYLLNYAGRIVSSVFLQYPYTKITRLLCELSVQRQTLVQLFEADPVTFL 
YH 

RPAVGVIVGCELLLRFVALGLI VGTAL I SRGACAITYPLFLTITTWCFVS I IALTEL YFILRRDSAPK
NA 

EPAAPRGRSKGWSGVCGRCCS I 1 LSGI AVRLCYI AVVAGWLMALRYEQEIQRRLFDL * 



Gene matched: gi | 116105 | sp | P22485 | CELF_HSV2H
Gene name: CELL FUSION PROTEIN PRECURSO 



[SEQ ID NO: 268]
ORF # = 60 from Contig 15
ORF start site = 105702 
ORF end site = 107240 
ORF sequence: 

MATDIDMLIDLGLDLSDSELEEDALERDEEGRRDDPESDSSGECSSSDEDMEDPCGDGGAEAIDAAIP 
KG

PPARPEDAGTPEASTPRPAARRGADDPPPATTGVWSRLGTRRSASPREPHGGKVARIQPPSTKAPHPR 
GG 

RRGRRRGRGRYGPGGADSTPNPRRRVSRNAHNQGGRHPASARTDGPGATHGEARRGGEQLDVSGGPRP 
RG 

TRQAPPPLMALSLTPPHADGRAPVPERKAPSADTIDPAVRAVLRS I SERAAVERI SESFGRSALVMQD
PF 

GGMPFPAANSPWAPVIATQAGGFDAETRRVSWETLVAHGPSLYRTFAANPRAASTAKAMRDCVLRQEN 
LI 

EALASADETLAWCKMC IHHNLPLRPQDPI IGTAAAVLENLATRLRPFLQCYLKARGLCGLDDLCSRRR 
LS

DIKDIASFVLVILARLANRVERGVSEIDYTWGVGAGETMHFYIPGACMAG^ 
EL 

TASHTIAPLYVHGKYFYCNSLF * 

Gene matched: gi | 124181 | sp | P28276 | IE63_HSV2H
Gene name: TRANSCRIPTIONAL REGULATOR IE 



ORF # = 62 
ORF start site = 108542
ORF end site = 108189
ORF sequence: 

MIGAHPGVGGDLPSGLPTYAEATSDRPPTYAMV^ 
EL 

YRAQRAARCASSSDTPQAPGWCGGTCRHAVFGWAWWI ILAFLWR*

Gene matched: gi | 136952 | sp| P28282 |UL56_HSV2H 
Gene name: PROTEIN UL56^Agi | 73833 | pir | | W


[SEQ ID NO: 269] 
ORF # = 63 from Contig 15 
ORF start site = 112958 
ORF end site = 113542
ORF sequence: 

MHLFCQCPLTIX5QDLYLCPWPRMHQEHLVCPLHRLDDARRRGRTSAAWDEGLVRALTHSGGLMGCGG 
RS 

LTLSETYWGHPLYEKLVPWDHPRDLKVPEASAVGTRALVPRGRGRPLRGRPVPLIPLDCEPNDGLPFG 
GG

WPGGRLRGAPVPLHPPPPSAPPLSFTPTLTPPCLCRGLSLCVWKQYLKDRNNF* 

Gene matched: gi | 1644457 
Gene name: (U72521) neural variant mena+ protein [Mus muscu






TABLE 4 

All amino acid sequences within Table 4 are encoded by Contig 15
[SEQ ID NO: 217] of Table 3 
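
The ORF start and end sites given for each entry can be read as positions on Contig 15. The following is a minimal Python sketch of how such a header might be interpreted. It assumes (this is not stated in the listing) that the coordinates are 1-based and inclusive, and that a start site larger than the end site indicates an ORF on the reverse strand. The function names `orf_span` and `parse_orf_header` are hypothetical helpers introduced only for illustration.

```python
import re


def orf_span(start: int, end: int) -> tuple[str, int]:
    """Return (strand, nucleotide length) for an ORF.

    Assumption: coordinates are 1-based and inclusive; start > end is
    taken to mean the ORF lies on the reverse strand of the contig.
    """
    strand = "+" if start <= end else "-"
    return strand, abs(end - start) + 1


def parse_orf_header(text: str) -> dict:
    """Extract the ORF label and coordinates from one listing header."""
    orf = re.search(r"ORF #\s*=\s*(\S+)", text)
    start = re.search(r"ORF start site\s*=\s*(\d+)", text)
    end = re.search(r"ORF end site\s*=\s*(\d+)", text)
    if not (orf and start and end):
        raise ValueError("header is missing an ORF field")
    strand, length = orf_span(int(start.group(1)), int(end.group(1)))
    return {
        "orf": orf.group(1),
        "strand": strand,
        "nt_length": length,
        # A span that is a multiple of three is consistent with whole codons.
        "whole_codons": length % 3 == 0,
    }


header = """ORF # = 9b
ORF start site = 14508
ORF end site = 11905
"""
print(parse_orf_header(header))
```

Under these assumptions, a span that is not a multiple of three would suggest either a different coordinate convention or a transcription error in the listed site.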

[SEQ ID NO: 270] = 15 
ORF # = 9b 
ORF start site = 14508
ORF end site = 11905 
ORF sequence: 

MNVATCTHQTHHAARAPGATSAPGAASGDPLGARRPIGDDECEQYTSSVSLARMLYGGDLAEWV^ 
PK

TTIERQQHGPWFPDASAPTARCVTVVRAPMGSGKTTALIRWLGEAIHSPDTSVLWSCRRSFTQTIA
TR 

FAESGLPDFVTYFSSTOTIMNDRPFHRLIVQVESLHRVGPN^ 
LG 

RVDALMLRLLRTCPRI IAMDATANAQLVDFLCSLRGEKNVHWIGEYAMPGFSARRCLFLPRLGPEVL
QA 

ALRPPGPAGGAPPPDAPPDATFFGELEARLAGGDNVCIFSSTVSFAEWARFCRQFTDRVLLLHSLTP 
PG 

DVTTWGRYRVVIYTTVVTVGLSFDPPHFDSMFAYVKPMNYGPDMVSVYQ 
SG

ARSEPWTPMLIJmWSASGQWPAQFSQVTOLI^ 
CT 

LACLADSLNILHMLLTLNCMHTOFWGHDAALTPRNFCLFLRGIHFDALRAQRDLRELRCQDPDTSL 
QA 

AETEEVGLFVEKYLRPDVAPAEWALMRGLNSLVGRTRFIYLVLLEACLRVPMAAHSSAIFRRLYDHY
AT 

GVI PTINAAGELELVALHPTLNVAPVWELFRLCSTMAACLQWDSMAGGSGRTFS PEDVLELLNPHYDR 
YM 

QLVFELGHCNVTDGPLLSEDAVKRVADALSGCPPRGSVSETEHALSLFKIIWGELFGVQLAKSTQTFP 
GA

GRVKNLTKRAIVELLDAHRIDHSACRTHRQLYALLMAHKREFAGARFKLRAPAWGRCLRTHASGAQPN 
TD 

1 1 LEAALS ELPTEAWPMMQGAVNFSTL * 


Gene matched: gi | 1869831 | gnl | PID | e304265 
Gene name: (Z86099) UL9 [human herpesvirus 

[SEQ ID NO: 271] = 15 
ORF # = 9a

ORF start site = 14520 
ORF end site = 11904 
ORF sequence: 

MAETMNVATCTHQTHHAARAPGATSAPGAASGDPLGARRP IGDDECEQYTS SVSLARMLYGGDLAEWV 
PR

VliPKTTIERQQHGPVTFPDASAPTARCVTWRAPMGSGKTTALI^ 
QT 

LATRFAESGLPDFVTYFSSTNYIMNDRPFHRLIVQVESLHRVGPNLLNNTO 
TM 

QQLGRVDAIfMLRLLRTC PRI IAMDATANAQLVDFLCSLRGEKNVHWIGEYAMPGFSARRCLFLPRLG
PE 

VLQAALRPPGPAGGAPPPDAPPDATFFG ELEARLAGGDNVC I FS STVSFAEWARFCRQFTDRVLLLH 
SL 

TPPGDWTWGRYRWIYTTWIVGLSFDPPHFDSMFAYW 
YM

DGSGARSEPVFTPMLLNHWSASGQWPAQFSQVTNLLCRRFKGRCDASHADAAQARGSRIYSKFRYKH
YF 

ERCTLACLADSI^ILHMLLTLNCMH^ 
SL 

SAQAAETEEVGLFVEKYLRPDVAPAEWALMRGLNSLVGRTRFIYLVLLEACLRVPMAAHSSAIFRRL
YD 

HYATGVI PT INAAGELELVALHPTLNVAPVWELFRLCSTMAACLQWDSMAGGSGRTFS P EDVLELLNP 
HY 

DRYMQLVFEI^HCNVTIXSPLLSEDAVKRVADALSGCPPRGSVSETEHALSLFKIIWGELFGVQLAKST 
QT

F PGAGRVKNL.TKRA I VELLDAHRIDHSACRTHRQLYALLMAHKREFAGA^ 
AQ 

PNTDI ILEAALSELPTEAWPMMQGAVNFSTL * 


Gene matched: gi | 136806 | sp | P10193 | OBP_HSV11
Gene name: ORIGIN OF REPLICATION BINDING 



[SEQ ID NO: 272] = 15
ORF # = 20a 

ORF start site = 31782 
ORF end site = 27630
ORF sequence: 

MEPANPPRNPMAAPARDPPGYRYAAAMVPTGSILSTIEVASHRRLFDFFARVRSDENSLYDVEFDALL
GS 

YCNTLSLWFLEIX3LSVACVCTKFPELAYMNEGRVQFEVHQPLIARDGPHPVEQPVHNYMTKVIDRRA 
LN 

AAFSLATEAIALLTGEALDGTGISLHRQLRAIQQLARW 
LP

MQRYLDNGRLATRVARATLVAELKRSFCDTSFFLGKAGHRREAIEAWLVDLTTATQPSVAVPRLTHAD 
TR 

GRPVDGVLVTTAAIKQRLLQSFLKVEDTEADVPVTYGEMVLNGANLVTALVMGKAVR 
MQ 

EEQLEANRETLDELESAPQTTRVRADLVAIGDRLVFLEALEKRIYAATNVPYPLVGAMDLTFVLPLGL
FN 

PAMERFAAHAGDLVPAPGHPEPRAFPPRQLFFWGKDHQVLRLSMENAVGTVCHPSLMNIDAAVGGYtra 
DP 

VEAANPYGAYVAAPAGPGADMQQRFLNAWRQRLAHGRWWAECQMTAEQFMQPDNANLAL 
FF

AGVADVELPGGEVPPAG PGAIQATWRWNGNLPLALC PVAFRDARGLELGVGRHAMAPATI AAVRGAF 
ED 

RSYPAVFYLLQAAIHGSEHVFCAIJUtfjWQCITSYWNNTRCAAFW 
VY

RDLVAHVEALAQLVDDFTLPGPELGGQAQAEI^
GG 

HEPVYAAACNVATADFNRNIXSRLLHNTQARA^ 
GV 

RFDRVYATLQNMVVPEIAPGEECPSDPWDPAHPLHPANLVANT^
HN 

MAERTTAI^CSAAPDAGANTASTANMRI FDGALHAGVLLMAPQHLDHTI QNGEYF YVLPVHALFAGM 
HV 

ANAPNFPPALRDLARHVPLVPPALGANYFS S I RQPWQHARES AAGENALTYALMAGYFKMS PVAL YH 
QL

KTGLHPGFGFTVVRQDRFVTENVLFSERASEAYFLGQLQVARHETGGGVSFTLTQPRGNVDLGVGYTA 
VA 

ATATVRNPVTDMGNLPQNFYLGRGAPPLLDNAAAVYLRNAWAGNRLGPAQPLPVFGCAQVPRRAGMD 
HG 

QDAVCEFI ATPVATDINYFRRPCNPRGRAAGGVYAGDKEGDVI ALMYDHGQSDPARPFAATANPWASQ
RF 

SYGDLLYNGAYHLNGASPVLSPCFKFFTAADITAKHRCLERLIVETGSAVSTATAASDVQFKRPPGCR 
EL 

VEDPCGLFQEAYPITCASDPALLRSARDGEAHARETHFTQYLIYDASPLKGLSL* 


Gene matched: gi | 137571 | sp | P06491 | VCAP_HSV11
Gene name: MAJOR CAPSID PROTEIN (MCP) 



[SEQ ID NO: 273] = 15
ORF # = 20b 
ORF start site = 31754 
ORF end site = 27630 
ORF sequence: 

MAAPARDP PGYRYAAAMVPTGS IL STI EVASHRRLFDFFARVRSDENSLYDVEFDALLGSYCNTLSLV
RF 

LEIX3LSVAOTCTKFPELAYMNEGRVQFEVHQPLIARDGPH 
AI 

ALLTGEALDGTGISLHRQLRAIQQIARNVQAVLGAFERGTA^ 
RL

ATRVARATLVAELKRS FCDTSFFLGKAGHRREAI EAWLVDLTTATQPSVAVPRLTHADTRGRPVDGVL 
VT 

TAAIKQRLLQSFLKVEDTEADVPWYGEMVLNGANLOTALV^ 
ET 

LDELESAPQTTRVRADLVAIGDRLWLEALEKRIYAATNVPYPLVGAMDLTFVLPLGLFNPAMERFAA
HA 

GDLVPAPGHPEPRAFPPRQLFFWGKDHQVLRLSMENAVGTVCHPSM 
AY 

vaapagpgadmqqrflnawrqrij^grvrwaecqmtaeqfmqpdnanlalelhpafdffagvadvel 
PG

GEVP PAGPGAIQATWRWNGNLPLALC PVAFRDARGLEIX3VGRHAMAPATIAAVRGAFEDRS YPAVFY
LL 

QAAIHGSEHVFCALARLVTQCITSYWNNTRCAAFVNDYSLVSYI 
AL 

AQLVDDFTLPGPEIXSGQAQAELNHLMRDPALLPPLVWTCDGIoMRHAAI^
CN 

VATADFNRNIX3RLLHNTQARAADAADDRPHRPADWTVHHKIYYYVLVPAFSRGRC 
LQ 

NMVVPEIAPGEECPSDPVTDPAHPLHPANLVANTVNAMFHNGRVVVIX3P 
LC

SAAPDAGANTASTANMRIFDGALHAGVLLMAPQHLDHTIQNGEYFYVLPVHALFAGADHVANAPNFPP 
AL 

RDLARHVPLVPPALGANYFSS IRQPWQHARESAAGENALTYALMAGYFKMS PVALYHQLKTGLHPGF 
GF 

TWRQDRFVTENVLFSERASEAYFLGQLQVARHErTGGGVSFT^
VT 

DMGNLPQNFYLGRGAPPLLDNAAAVYLRNAWAGNRLGPAQPLPVFGCAQVPRRAGMDHGQDAVCEFI 
AT 

PVATDINYFRRPCNPRGRAAGGVYAGDKEGDVIALMYDHGQSDPARPFAATANPWASQRFSYGDLLYN 
GA

YHLNGASPVLSPCFKFFTAADITAKHRCLERLIVETGSAVSTATAASDVQFKRPPGCRELVEDPCGLF 
QE 

AYPITCASDPALLRSARDGEAHARETHFTQYLIYDASPLKGLSL* 

Gene matched: gi | 137571 | sp | P06491 | VCAP_HSV11
Gene name: MAJOR CAPSID PROTEIN (MCP) 

[SEQ ID NO: 274] = 15
ORF # = 22a

ORF start site = 33002 
ORF end site = 34984 
ORF sequence: 

MRPELSLKGRPCVTEAWCPSTDAAIHSGGSSSVRPQPYARAARARATHGSRSRHRQPLLPPPSSHHP 
TI

PPPPS PPRGS PAMELSYATTLHHRDWFYVTADRNRAYFVCGGSVY SVGRPRDSQPGEI AKFGLWRG 
TG 

PKDRMVANYVRSELRQRGLRDVRPVGEDEWLDSVCLLNPNVSSERD^ 
RT 

SPGVLVTGVRVRARDRVIELFEHPAIVNISSRFAYTPSPYVFALAQAHLPRLPSSLEPLVSGLFDGIP
AP 

RQPLDARDRRTDWI TGTRAPRPMAGTGAGGAGAKRATVSEFVQVKH I DRWS PS VSS APPPSAPDAS 
LP 

PPGLQEAAPPGPPLRELWWVFYAGDRALEEPHAESGLTREEVRAVHGFREQAWKLFGSVGAPRAFLGA 
AL

ALSPTQKLAVYYYLIHREREMSPFPALVRLVGR^
EQ 

LLMFDLLPPKDVPVGSDARADSAALLRFVDSQRLTPGGSVSPEHVMYLGAFLGVLYAGHGR^ 
AR 

LTGVTSLVLTVGDVDRMSAFDRGPAGAAGRTRTAGYLDALLTVCLARAQHGQSV*

Gene matched: gi | 136845 | sp | P10205 | UL21_HSV11
Gene name: PROTEIN UL21 


[SEQ ID NO: 275] = 15
ORF # = 22b
ORF start site = 33385
ORF end site = 34984
ORF sequence:

MELSYATTLHHRDVVF^ADRNRAYFVCGGSVYSVGRPRDSQPGEIAKFGLVVRGTGPKDRMVANYV 
RS 

ELRQRGLRDTOPVGEDE^LDSVCLI^PNVSSERDVINTNDVEVLDECLAEYCTSLRTSPGVLVTGVR 
VR 

ARDRVIELFEHPAIVNISSRFAYTPSPYVFALAQAHLPRLPSSLEPLVSGLFDGIPAPRQPLDARDRR
TD 

WITGTRA PR PMAGTGAGGAGAKRATV S EFVQVKHI DRWS PSVS S APP PS APDASL PPPGLQEAAPP 
GP 

PLRELWWWYAGDRALEEPHAESGLTREEVRAVHGFREQAV^LFGSVGAPRAFLGAALALSPTQKLAV 
YY

YLIHRERRMSPFPALVRLVGRYIQRHGLWPAPDEPTLADAMNGLFRDALAAGTVAEQLLMFDLLPPK 
DV 

PVGSDARADSAALLRFVDSQRLT PGGSVS PEHVMYLGAFLGVLYAGHGRLAAATHTARLTGVTSLVLT 
VG 

DVDRMSAFDRGPAGAAGRTRTAGYLDALLTVCLARAQHGQSV*

Gene matched: gi | 136845 | sp | P10205 | UL21_HSV11
Gene name: PROTEIN UL21 


[SEQ ID NO:276] = 15 
ORF # = 41a 

ORF start site = 75756 
ORF end site = 77588 
ORF sequence:

MTAAALYGGAKYRPGTLRNPGRVASTPRRRGVLYGALCPGI PFVGSGPGAVGWECVCVGGGRRDGG PD 
QV 

YRGRSVGRPNRPFKHLRMHRPSQSDTGTHQRRKPPSPVRVRVFSGGVFFLSALLPPHLHHPPPTTRPL 
AI

GGKTMKTKPLPTAPMAWAES AVETTTS PRELAGHAPLRRVLRP P I ARRDG PVLLGDRAPRRTASTMWL
LG 

I DPAESS PGTRATRDDTEQAVDKI LRGARRAGGL WPGAPRYHLTRQVTLTDLCQPNAERAGALLLAL 
RH 

PTDLPHLARHRAPPGRQTERLAEAWGQLLEASALGSGRAESGCARAGLVSFNFLVAACAAAYDARDAA
EA 

VRAHITTNYGGTRAGARLDRFSECLRAMVHTHVFPHEVMRFFGGLVSWVTQDEIASVTAVCSGPQEAT 
HT 

GHPGRPCSAVTIPACAFVDLDAELCLGGPGAAFLYLVFTYRQCRDQELCCVYVVKSQLPPRGLEAALE 
RL

FGRLRITNTI HGAEDMT PPPPNRNVDFPLAVLAAS SQS PRC SASQVTNPQFVDRLYRWQPDLRGRPTA 
RT 

CT YAAFAELGVM PDNS PRCLHRTERFGAVG VP WI LEG WWR PGGWRAC A * 

Gene matched: gi | 139176 | sp | P22486 | VP19_HSV2G
Gene name: CAPSID ASSEMBLY AND DNA MATU 



[SEQ ID NO: 277] = 15
ORF # = 41b

ORF start site = 75817 
ORF end site = 77588 
ORF sequence: 

MHRPSQSDTGTHQRRKPPSPVRVRVFSGGVFFLSALLPPHLHHPPPTTRPLAIGGKTMKTKPLPTAPM 
AW

AESAVETTTSPREIAGHAPLRRVLRPPIARRIX3PVLLGDRAPRRTASTMWLLGIDPAESSPGTRATRD 
DT 

EQAVDKII^GARRAGGLTVPGAPRYHLTRQVTLTDLCQPNAERAGALLLALRHPTDLPHIJ^RHRAPPG 
RQ 

TERLAEAWGQLLEASALGSGRAESGCARAGLVS FNFLVAACAAAYDARDAAEAVRAHITTNYGGTRAG
AR 

LDRFSECLRAMVHTHVFPHEVMRFFGGLVSWVTQDELASVTAVCSGPQEATHTGHPGRPCSAVTIPAC 
AF 

VDLDAELCLGGPGAAFLYLVFTYRQCRDQELCCVYVVKSQLPPRGLE 
MT

PPPPNRNVDFPIAVI^SSQSPRCSASQOTNPQFVDRLYRWQPDLRGRPT^ 
SP 

RCLHRTERFGAVGVPWILEGWWRPGGWRACA* 

Gene matched: gi | 139176 | sp | P22486 | VP19_HSV2G
Gene name: CAPSID ASSEMBLY AND DNA MATU 



[SEQ ID NO: 278] = 15
ORF # = 41c
ORF start site = 76188
ORF end site = 77588 
ORF sequence: 

MKTKPLPTAPMAWAESAVETTTS PRELAGHAPLRRVLRPP I ARRDGPVLLGDRAPRRTASTMWLLGI D 
PA

ES S PGTRATRDDTEQAVDKI LRGARRAGGLTVPGAPRYHLTRQVTLTDLCQ PNAERAG ALLLALRH PT 
DL 

PHLARHRAPPGROTERLAEAWGQLLEASALGSGRAESGCARAGLVSFNFLVAACAAAYDARDAAEAW 
AH 

ITTNYGGTRAGARLDRFSECLRAMVHTHVFPHEVMRFFGGLVSWOT
PG 

RPCSAOTIPACAFOTLDAELCLGGPGAAFLYLVFTYRQCR 
RL 

RITNTIHGAEDMTPPPPNRNVDFPLAVLAASSQSPRCSASQVTNPQFVDRLYRWQPDLRGRPTARTCT 
YA

AFAELGVMPDNS PRCLHRTERFGAVGVPWI LEGWWR PGGWRACA * 

Gene matched: gi | 139176 | sp | P22486 |VP19_HSV2G 
Gene name: CAPSID ASSEMBLY AND DNA MATU 


[SEQ ID NO:279] = 15 
ORF # = 46a 
ORF start site = 86432 
ORF end site = 87820
ORF sequence: 

MAWCGSGLRLRPFHPPSPSFFVLRALIRAGPGPFAASPRAPSGPGCGMCRGDSPGVAGGSGEHCLGG 
DD 

GDDGRPRIACVGAIARGFAHLWLQATTLGFVGSVVLSRGPYADAMSGAFVIGSTGLGFLRAPPAFARP 
PT

RVCAWLRLVGGGAAVALWSLGEAGAPPGVPGPATQCLA^ 
LG 

WVGGLTIGGSARYWWI DPRAAAALTAAWAGLGTTAAGDS FSKAC PRHRRFC WSAVES PPPRYAPE 
DA 

ERPTDHGPLLPSTHHQRSPRVCGDGAARPENIVATPVVTFAGAliALAACAARGSDA^
FV 

GGHAAAGLTELCQTIAPRDLTDPLLFAYVGFQWNHGL^^ 
LH 

KDPDAGPWAAATLRGLFFSVYALGFAAGVLVRPRMAASRRSG* 


Gene matched: gi | 136909 | sp | P10227 | UL43_HSV11
Gene name: MEMBRANE PROTEIN UL43 



[SEQ ID NO: 280] = 15
ORF # = 46b
ORF start site = 86576 
ORF end site = 87820 
ORF sequence: 

MCRGDS PGVAGGSGEHCLGGDDGDIX3RPRLACVGAI ARGFAHLWLQATTLGFVG S WLSRG PYADAMS
GA 

FVIGSTGLGFLRAPPAFARPPTRVCAVflJRLVGGGAAVAL^ 
LV 

LADDVHPLFLLAPRPLFVGTLGVWGGLTIGGSARYWWIDPRAAAALTAAWAGI^ 
PR

HRRFCWSAVESPPPRYAPEDAERPTDHGPLLPSTHHQRSPRVCGDGAARPENIWVPWTFAGALALA 
AC 

AARGSDAAPSGPVLPLVnPQVFVGGHAAAGLTELCQTLAPRDLTDPLLFAWGFQVVNHGLMFW 
VY 

AMLGGAVWI SLTQVLGLRRRLHKDPDAGPWAAATLRGLFF SVYALGFAAGVLVR PRMAASRRSG *

Gene matched: gi | 136909 | sp| P10227 |UL43_HSV11 
Gene name: MEMBRANE PROTEIN UL43 


[SEQ ID NO:281] = 15 
ORF # = 57a 

ORF start site = 100984 
ORF end site = 102942 
ORF sequence:

MGTEIXTDHEGRSVAAPVEVMALYATIX3CVITSSLALLTNCLLGAEPLYIFSYDAYRPDAP 
EQ 

ERFEGSRALYRDAGGLNGDS FRVTFCLLGTEVGVTHHPKGRTRPMFVCRFERADDVAVLQDALGRGT P 
LL 

PAHITATLDLEATFALHANIIMALTVAIVHNAPARIGSGSTAPLYEPGESMRSWGRMSLGQRGLTTL
FV 

HHEARVLAAYRRAYYGSAQSPFWFLSKFGPDEKSLVLAARYYVLQAPRL 
AI 

PHDPRPDTLSAASLTSFAAITRFCCTSQYSRGAAAAGFPLYVERRIAADVRETGALEKFIAHDRSCLR 
VS

DREFITYIYLAHFECFSPPRLATHLRAVTTHDPSPAASTEQPSPIX3REAVEQFFRHVRAQLNIREYVK 
QN 

VTPRETAIAGDAAAAYLRARTYAPAALTPAPAYCGVADSSTKMMGRIiAEAERLLVPHGWPAFAPTTPG 
DD 

AGGGTAAPQTCGI VKRLLKLAATEQQGTTP PAI AALMQDASVQTPL PVYRI TMS PTGQAFAAAARDDW
AR 

VTRDARPPEATWADAAAAPEPGALGRRLTRRI CARGPAPPPGRPGRRG PDVRE PQRDLQRRAGRYEH 
HP 

GSGHRPEGARPLSPAPRGPGSL*

Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11
Gene name: DNA HELICASE / PRIMASE COMPLEX 



[SEQ ID NO: 282] = 15
ORF # = 57b 

ORF start site = 100765 
ORF end site = 102942 
ORF sequence: 

MGAGKSALTTARASCSRGSXSEGGAAARI I S YCC SSGRVPQPHSTPSRDAI PEHARGS APAFPHPTPS
GF 

AGAMGTEDCDHEGRSVAAPVEVMALYATIX3CVITSSLALLTNCLLGAEPLYIFSYDAYRPDAPNGPTG 
AP 

TEQERFEGSRALYRDAGGLNGDSFRVTFCLLGTEVGVTHHPKGRTRPMFVCRFERADDVAVLQDALGR 
GT

PLLPAHITATLDLEATFALHANIIMALTVAIVHNAPARIGSGSTAPLYEPGESMRSVVGRMSLGQRGL 
TT 

LFVHHEARVLAAYRRAYYGSAQSPFWFLSKFGPDEKSLVLAARYYVLQ 
AT 

YAIPHDPRPDTLSAASLTSFAAITRFCCTSQYSRGAAAAGFPLYVERRIAADVRETGALEKFIAHDRS
CL 

RVSDREFITYIYLMFECFSPPRLATHLRAVTTHDPSPAASTEQPSPLGREAVEQFFRHVRAQLNIRE 
YV 

KQlWTPRETAI^GDAAAAYLRARTYAPAALTPAPAYCGVADSSTKl^GRl^EAERLLVPHGWPAFAPT 
TP

GDDAGGGTAAPQTCGI VKRLLKLAATEQQGTTPPAI AALMQDASVQT PLPVYRI TMS PTGQAFAAAAR 
DD 

WARVTRDARPPEIATWADAAAAPEPGALGRRliTRRICARG PAPP PGRPGRRG PDVREPQRDLQRRAGR 
YE 

HHPGSGHRPEGARPLSPAPRGPGSL*

Gene matched: gi | 136939 | sp| P10236 |UL52_HSV11 
Gene name: DNA HELICASE / PRIMASE COMPLEX 


[SEQ ID NO: 283] = 15 
ORF # = 57c 

ORF start site = 100678 
ORF end site = 102942 
ORF sequence:

MQAWYVRARARAFTRRRVS S S DS RASS SVMG AGKSALTTARASC SRGSXS EGGAAARI I S YCCS SGRV 
PQ 

PHSTPSRDAI PEHARGS APAFPHPTPSGFAGAMGTEDCDHEGRSVAAPVEVMALYATDGCVTTSSLAL 
LT

NCLLGAEPLYIFSYDAYRPDAPNGPTGAPTEQERFEGSRALYRDAGGLNGDSFRVTFCLLGTEVGVTH
HP 

KGRTRPMFVCRFERADDVAVLQDALGRGTPLLPAHITATLDLEATFALHANIIMALTVAIVHNAPARI 
GS 

GSTAPLYEPGESmSWGRMSLGQRGLTTLFVHHEARVIAAYRRAYYGSAQSPFWFLSKFGPDEKSLV
LA 

ARYYVLQAPRLGGAGATYDLQAVKDICATYAIPHDPRPDTLSAASLTSFAAITRFCCTSQYSRGAAAA 
GF 

PLYVERRIAADVRETGALEKFIAHDRSCLRVSDREFITYIYLAHFECFSPPRLATHLRAVTTHDPSPA 
AS

TTEQPSPK3REAVEQFFRHRAQLNIREYVKQNVTPRETALAGDAAAAYLRARTYAPAALTPAPAYCGV 
AD 

SSTKl^GRIJVEAERLLVPHGWPAFAPTTPGDDAGGGTAAPQTCGIVKRLLKLAATEQOG 
MQ 

DASVQTPLPVYRI TMS PTGQAFAAAARDDWARVTRDARPPEATVVADAAAAPEPGALGRRLTRR I CAR
GP 

APPPGRPGRRGPDVREPQRDLQRRAGRYEHHPGSGHRPEGARPLSPAPRGPGSL* 

Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11 
Gene name: DNA HELICASE / PRIMASE COMPLEX



[SEQ ID NO: 284] = 15 
ORF # = 57d 
ORF start site = 100624
ORF end site = 102942 
ORF sequence: 

MVEPSSPGWVmASLSRLTMQAWYVRARARAFTRRRVSSSDSRASSSVMGAGKSALTTARASCSRGSXS 
EG 

GAAARIISYCCSSGRVPQPHSTPSRDAIPEHARGSAPAFPHPTPSGFAGAMGTEDCDHEGRSVAAPVE
VM 

ALYATDGCVITSSLALLTNCLLGAEPLYIFSYDAYRPDAPNGPTGAPTEQERFEGSRALYRDAGGLNG 
DS 

FRVTFCLLGTEVGVTHHPKGRTRPMFVCRFERADDVAVLQDALGRGTPLLPAHITATLDLEATFALHA
NI

IMALTVAIVHNAPARIGSGSTAPLYEPGESMRSVVGRMSLGQRGLTTLFVHHEARVLAAYRRAYYGSA
QS

PFWFLSKFGPDEKSLVLAARYYVLQAPRLGGAGATYDLQAVKDICATYAI PHDPRPDTLSAASLTSFA 
AI 

TRFCCTSQYSRGAAAAGFPLYVERRIAADVRETGALEKFIAHDRSCLRVSDREFITYIYLAHFECFSP
PR 

LATHLRAVTTHDPS PAASTEQPS PLGREAVEQFFRHVRAQLNI REYVKQNVTPRETALAGDAAAAYLR 
AR 

TYAPAALTPAPAYCGVADSSTKMMGRLAEAERLLVPHGWPAFAPTTPGDDAGGGTAAPQTCGIVKRLL 
KL

AATEQQGTTPPAI AALMQDAS VQT PL PVYRITMS PTGQAFAAAARDDWARVTRDARPPEATWADAAA
AP 

EPGALGRRLTRRICARGPAPPPGRPGRRGPDVREPQRDLQRRAGRYEHHPGSGHRPEGARPLSPAPRG 
PG 
SL*

Gene matched: gi | 136939 | sp| P10236 |UL52_HSV11 
Gene name: DNA HELICASE/PRIMASE COMPLEX 


[SEQ ID NO:285] = 15 
ORF # = 57e 

ORF start site = 100567 
ORF end site = 102942
ORF sequence:

MHVSARRRILSRCAATAPSMTCPSSPGWWRASLSRLTMQA 
AG 

KSALTTARASCSRGSXSEGGAAARIISYCCSSGRVPQPHSTPSRDAIPEHARGSAPAFPHPTPSGFAG
AM

GTEDCDHEGRSVAAPVEVMALYATDGCVITSSLALLTNCLLGAEPLYIFSYDAYRPDAPNGPTGAPTE
QE

RFEGSRALYRDAGGLNGDSFRVTFCLIX3TEVGVTHHPKGRTRPMFVCRFERADDVAVLQDALGRGTPL 
LP 

AHITATLDLEATFALHANIIMALTVAIVHNAPARIGSGSTAPLYEPGESMRSWGRMSLGQRGLTTLF 
VH

HEARVIJyVYRRAYYGSAQSPFWFLSKFGPDEKSLVLAARYYVLQAPRLGGAGATYDLQAVKDICAT 
IP 

HDPRPDTLSAASLTSFAAITRFCCTSQYSRGAAAAGFPLYVERRIAADVRETGALEKFIAHDRSCLRV 
SD 

REFITYIYI^FECFSPPRLATHLRAVTTHDPSPAASTEQPSPLGREAVEQFFRHVRAQLNIREYVKQ
NV 

TPRETAIxAGDAAAAYLRARTYAPAALTPAPAYCGVADSSTKMMGRLAEAERLLVPHGWPAFAPTTPGD 
DA 

GGGTAAPQTCGIVKRLLKLAATEQQGTTPPAIAALMQDASVQTPLPVYRITMSPTGQAFAAAARDDWA
RV

TRDARPPEATVVADAAAAPEPGALGRRLTRRICARGPAPPPGRPGRRGPDVREPQRDLQRRAGRYEHH
PG

SGHRPEGARPLSPAPRGPGSL* 

Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11
Gene name: DNA HELICASE/PRIMASE COMPLEX 



[SEQ ID NO: 286] = 15 
ORF # = 57f
ORF start site = 100558
ORF end site = 102942 
ORF sequence: 

WAMHVSARRRILSRCAATAPSMVEPSSPGWWRASLSRLTMQAWYVRARARAFTRRRVSSSDSRASSS 
VM

GAGKS ALTTARASCSRGSXSEGGAAARI I S YCCSSGRVPQPHSTPSRDAI PEHARGSAPAFPHPTPSG 
FA 

GAMGTEDCDHEGRSVAAPVEVMALYATDGCVITSSL^^ 
PT 

EQERFEGSRALYRDAGGLNGDSFRVTFCLLGTEVGVTHH PKGRTRPMFVCRFERADDVAVLQDALGRG
TP 

LLPAHITATLDLE^TFALHANIIMALTVAIVHNAPARIGSGSTAPLYEPGESMRSWGRMSLGQRGLT 
TL 

FVHH EARVIAAYRRAYYGSAQS PFWFLSKFGPDEKSLVIAARYYVLQA PRLGGAGATYDLQAVKDI CA 
TY

AI PHDPRPDTLSAASLTSFAAITRFCCTSQYSRGAAAAGFPLYVERRIAADVRETGALEKFIAHDRSC 
LR 

VSDREFITYIYLAHFECFSPPRLATHLRAVTTHDPSPAASTEQPSPLGREAVEQFFRHVRAQLNIREY 
VK 

QNVT PRETALAGDAAAA YLRARTYAPAALT PAPAYCGVADS STKMMGRLAEAERLLVPHGWP AFAPTT
PG 

DDAGGGTAAPQTCGIVKRLLKLAATEQQGTTPPAIAALMQDASVQTPLPVYRITMSPTGQAFAAAARD 
DW 

ARVTRDARPPEATWADAAAAPEPGALGRRLTRRICARGPAPPPGRPGRRGPDVREPQRDLQRRAGRY 
EH

HPGSGHRPEGARPLSPAPRGPGSL* 



Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11
Gene name: DNA HELICASE / PRIMASE COMPLEX 




[SEQ ID NO: 287] = 15 
ORF # = 57g 

ORF start site = 100543 
ORF end site = 102942
ORF sequence: 

MYTCRMVAMHVSARRRI LSRCAATAPSMVEPS SPGWWRASLSRLTMQAWYVRARARAFTRRRVS SSDS 
RA 

SSSVMGAGKSALTTARASCSRGSXSEGGAAARI IS YCCSSGRVPQPHSTPSRDAI PEHARGSAPAFPH 
PT

PSGFAGAMGTEDCDHEGRSVAAPVEVMALYATIX^VITSSIALLTNCLLGAEPLYIFSYDAYRPDAPN 
GP 

TGAPTEQERFEGSRALYRDAGGLNGDSFRVTFCLLGTEVGVTHHPKGRTRPMFVCRFERADDVAVLQD 
AL

GRGTPLLPAH ITATLDLEATFALHANI IMALOTAI VHNAPARIGSGSTAPLYEPGESMRSVVGRMSLG
QR 

GLTTLFVHHEARVLAAYRRAYYGSAQSPFWFLSKFGPDEKSLVLAARYYVLQA 
KD 

ICATYAIPHDPRPDTLSAASLTSFAAITRFCCTSQYSRGAAAAGFPLYVERRIAADVRETGALEKFIA
HD 

RSCLRVSDREFITYIYLAHFECFSPPRIATHLRAVTTHDPSPAASTEQPSPI^REAVEQFFTIHVRAQL 
NI 

REYVKQNVTPRETAIAGDAAAAYLRARTYAPAALTPAPAYCGVA^^ 
FA

PTTPGDDAGGGTAAPQTCGIVKRLLKIAATEQQGTTPPAIAALMQDASVQTPLPVYRITMSPTGQAFA 
AA 

ARDDWARVTRDARPPEATWADAAAAPEPGALGRRLTRRICARGPAPPPGRPGRRGPDVREPQRDLQR 
RA 

GRYEHHPGSGHRPEGARPLS PAPRGPGSL *

Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11
Gene name: DNA HELICASE / PRIMASE COMPLEX 


[SEQ ID NO: 288] = 15 
ORF # = 57h 

ORF start site = 100483 
ORF end site = 102942 
ORF sequence:

MLRMAWETSTSADLSAAPTDMYICRMVAMHVSARRRILSRCAATAPSMVEPSSPGWWRASLSRLTMQA 
WY 

VRARARAFTRRRVSSSDSRASSSVMGAGKSALTTARASCSRGSXSEGGAAARIISYCCSSGRVPQPHS 
TP 

SRDAIPEHARGSAPAFPHPTPSGFAGAMGTEDCDHEGRSVAAPVEVMALYATDGCVITSSLALLTNCL
LG 

AEPLYIFSYDAYRPDAPNGPTGAPTEQERFEGSRALYRDAGGIiNGDSFRVTFCLLGTEVGVTHHPKGR 
TR 

PMFVCRFERADDVAVLQDALGRGTPLLPAHITATLDLEATFALHANI IMALTVAI VHNAPARI GSGST 
AP

LYEPGESMRSVVGRMSLGQRGLTTLFVHHEARVLAAYRRAYYGSAQSPFWFLSKFGPDEKSLVLAARY
YV

LQAPRLGGAGATYDLQAVKDICATYAIPHDPRPDTLSAASLTSFAAITRFCCTSQYSRGAAAAGFPLY 
VE 

RRIAADVRETGALEKFIAHDRSCLRVSDREFITYIYXAHFECFSPPRLATHLRAVTTHDPSPAASTEQ
PS 

PIXREAVEQFFRHVRAQLNIREYVKQ1WTPRETALAGDAAAAYLRARTYAPAALTPAPAYCGVADSST 
KM 

MGRLAEAERLLVPHGWPAFAPTTPGDDAGGGTAAPQTCGIVKRLLK 
VQ

TPLPVYRITMSPTGQAFAAAARDDWARVTRDARPPEATWADAAAAPEPGALGRRLTRRICARGPAPP
PG 

RPGRRGPDVREPQRDLQRRAGRYEHHPGSGHRPEGARPLSPAPRGPGSL* 

Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11
Gene name: DNA HELICASE/PRIMASE COMPLEX 



[SEQ ID NO: 289] = 15
ORF # = 57i

ORF start site = 100242 
ORF end site = 102942 
ORF sequence: 

MTTSLSAMLRMAWETSTSADLSAAPTDMYICRMVAMHVSARRRILSRCAATAPSMVEPSSPGWWRASL 
SR

LTMQAWYVRARARAFTRRRVSSSDSRASSSVMGAGKSALTTARASCSRGSXSEGGAAARIISYCCSSG 
RV 

PQPHSTPSRDAIPEHARGSAPAFPHPTPSGFAGAMGTEDCDHEGRSVAAPVEVMALYATDGCVITSSL 
AL 

LTNCLLGAEPLYIFSYDAYRPDAPNGPTGAPTEQERFEGSRALYRDAGGLNGDSFRVTFCLLGTEVGV
TH 

H PKGRTRPMFVCRFERADDVAVLQDALGRGTPLLPAH ITATLDLEATFALHANI I MALTVAI VHNAPA 
RI 

GSGSTAPLYEPGESMRSWGRMSLGQRGLTTLFVHHEARVLAAYRRAYYGSAQSPFWFLSKFGPDEKS 
LV

LAARYYVLQAPRLGGAGATYDLQAWDICATYAIPHDPRPim.SAASLTSFAAITRFCCTSQYSRGAA 
AA 

GFPLYVERRIAADTOETGALEKFIAHDRSCIJIVSDREFITYIYLAHFECFSPPRIATHLRAVTTHDPS 
PA 

ASTEQPSPIX3REAVEQFFRHVRAQLNIREYVKQNOTPRETALAGDAAAAYLRARTYAPAALTPAPAYC
GV 

AADS STKMMGRLAEAERLLVPGWPAFAPTTPGDDAGGGTAAPQTCGI VKRLLKLAATEQQGTT PPAI A 
AL 

MQDASVQTPLPVYRITMSPTGQAFAAAARDDWARVTRDARPPEATW^ 
AR

GPAPPPGRPGRRGPDVREPQRDLQRRAGRYEHHPGSGHRPEGARPLSPAPRGPGSL* 

Gene matched: gi | 136939 | sp | P10236 | UL52_HSV11
Gene name: DNA HELICASE/PRIMASE COMPLEX 


[SEQ ID NO: 290] = 15
ORF # = 61a 

ORF start site = 107456 
ORF end site = 108016
ORF sequence:

MTTTPLSNLFLRAPDITHVAPPYCLNATWQAENALHTTKTDPACLAARSYLVRASCSTSGPIHCFFFA 
VY 

KDSQHSL PLVTELRNFADLVNHP PVLRELEDKRGGRLRCTGPFSCGT I KDVSGAS PAGEYT ING I VYH 
CH

CRYPFSKTCWLGASAALQHLRSISSSGTAARAAEQRRHKIKIKIKV* 

Gene matched: gi | 136947 | sp | P28281 | UL55_HSV2H
Gene name: PROTEIN UL55^Agi | 73806 | pir | | W


[SEQ ID NO: 291] = 15
ORF # = 61b 

ORF start site = 107372 
ORF end site = 108016 
ORF sequence:

MWGPGPARFIARPGTHGRRWTDPPPRNMTTTPLSNLFLRAPDITHVAPPYCLNATWQAENALHTTKT 
DP 

ACLAARSYLVRASCSTSGPIHCFFFAVYKDSQHSLPLVTELRNFADLVNHPPVLRELEDKRGGRLRCT 
GP 

FSCGTIKDVSGASPAGEYTINGIVYHCHCRYPFSKTCWLGASAALQHLRSISSSGTAARAAEQRRHKI
KI 

KIKV* 



Gene matched: gi | 136947 | sp | P28281 | UL55_HSV2H
Gene name: PROTEIN UL55^Agi | 73806 | pir | | W

[SEQ ID NO: 292] = 15 
ORF # = 6a 
ORF start site = 6446
ORF end site = 8482 
ORF sequence: 

MAAQRARAPAMRTRGGDAALCAPEDGWVKVHPTPGTMLFR 
LQ 

AAI FHALLNATTYRDLE EDWRRHWARGLQ PQRLVRRYRNAREGDI AGVAERVFDTWRCTLRTTLLDF
AH 

GVVDCFAPGGPSGPTSFPKYIDWLTCLGLVPILRKTREGEATQRLGAFLRQHTLPRQLATVAGAAER^ 
GP 

GLLELAVAFDSTRMAEYDRVHI YYNHRRGEWLVRDPVSGQRGECLVLCPPLWTGDRLVFDS PVQRLC P 
EI

VACHALREHAHICRLRNTASVKVLLGRKSDSERGVAGAAARVVNKALGEDDETKAGSAASRLTO 
KG 

MRHVGDINDTVRAYLDEAGGHLIDTPAVDHTLPGFGKGGTGRGSAAQDPGARPQQLRQAFQTAVVNNI 
NG

MLEG YINNLFGTI ERLRETNAGLATQLQARDRELRRAQAGALEREQRAADRAAGGGAGRPAEADLLRA
DY 

DI I DVSKSMDDDTYVANS FQHQYI PAYGQDLERL SRLWEHELVRCFK ILRHRNNQGQETS I SYSSGAI 
AS 

FVAPYFEYVLRAPRAGALITGSDVILCEEELWEAVFKKTRLQTYLTDV
PA 

DFRASASPRGGSRSRTRTRSRS PGRTPRGAPDQGWGVERRDGRPHARR * 

Gene matched: gi | 136794 | sp | P10190 | UL06_HSV11
Gene name: VIRION PROTEIN UL6^Agi | 73994 |

[SEQ ID NO: 293] = 15
ORF # = 6b
ORF start site = 6326
ORF end site = 8482 
ORF sequence: 

MDVKFKNASSLNRTAGLAPGCCGGGPGARTSREPSPPDAAMAAQRARAPAMRTRGGDAALCAPEDGWV 
KV 

HPT PGTMLFREI LLGQMG YTEGQGVYNWRS S EAATRQLQAAI FHALLNATTYRDLEEDWRRHWARG
LQ 

PQRLVRRYRNAREGDIAGVAERVFDTWRCTLRTTLLDFAHGVVICFAPGGPSGPTSFPKYIDWLTCLG 
LV 

PILRKTREGEATQRIX5AFLRQHTLPRQLAWAGAAERAGPGLLEKAVAFDSTRMAEYDRVHIYYNHRR 
GE

WLWDPVSGQRGECLVLCPPLWTGDRLVFDSPVQRLCPEIVACHALREHAHICRLRNTASVKVLLGRK 
SD 

SERGVAGAARVVNKALGEDDETKAGSAASRLTOLIINMKGMRHVGDINDTVRAYLD 
DH 

TL PGFGKGGTGRGSAAQDPGARPQQLRQAFQTAWNNINGMLEGYINNLFGT I ERLRETNAGLATQLQ
AR 

DRELRRAQAGALEREQRAADRAAGGGAGRPAEADLLRADYDI IDVSKSMDDDTYVANSFQHQYI PAYG 
QD 

LERLSRLWEHELVRCFKILRHRNNQGQETSISYSSGAIASFVAPYFEYVLRAPRAGALITGSDVILGE 
EE

LWEAVFKKTRLQTYLTDVAALFVADVQHAALPRPPSPTPADFRASASPRGGSRSRTRTRSRSPGRTPR 
GA 

PDQGWGVERRDGRPHARR * 

Gene matched: gi | 136794 | sp | P10190 | UL06_HSV11
Gene name: VIRION PROTEIN UL6^Agi | 73994 |

[SEQ ID NO: 294] = 15 
ORF # = 6c 
ORF start site = 6296
ORF end site = 8482
ORF sequence: 

MRAMIGWTPCMDVKFKNASSLNRTAGI^ 
AL 

CAPEDGWVKVHPTPGTMLFREILLGQMGYTEGQGVYNVVRSSEAATRQLQAAI FHALLNATTYRDLEE
DW 

RRHWARGLQPQRLVRRYRNAREGDI AGVAERVFDTWRCTLR PSG PTS FP 

KY 

IDWLTCLGLVPILRKTREGEATQRLGAFLRQHTLPRQIATVAGAAERAGPGLLELAVAFDSTRMAEYD 
RV

HIYYNHRRGEWLVRDPVSGQRGECLVLCPPLWTGDRLVFDSPVQRLCPEIVACHALREHAHICRLRNT 
AS 

VKVLIXSRKSDSEIRGVAGAARVVNKAIiGEDDETKAGSAASRIiVRLIINM 
GG 

HL IDTPAVDHTL PGFGKGGTGRGS AAQDPGARPQQLRQAFQTAVVNNINGMLEGY INNLFGTI ERLRE
TN 

AGLATQLQARDRELRRAQAGALEREQRAADRAAGGGAGRPAEADLLRADYDI I DVSK SMDDDTYVANS 
FQ 

HQYI PAYGQDLERLSRLWEHELVRCFKI LRHRNNQGQETS I SYS SGAI ASFVAPYFEYVLRAPRAGAL 
IT

GSDVILGEEELWEAVFKKTRLQTYLTDVAALFVADVQHAALPRPPSPTPADFRASASPRGGSRSRTRT 
RS 

RS PGRTPRGAPDQGWGVERRDGRPHARR* 

Gene matched: gi | 136794 | sp | P10190 | UL06_HSV11
Gene name: VIRION PROTEIN UL6^Agi | 73994 |

[SEQ ID NO: 295] = 15
ORF # = 6d
ORF start site = 6167
ORF end site = 8482 
ORF sequence: 

MRYAANGNSRSGRPVGTSKAATSRNHCRRGTCVTSSCCCESSRMRAMIGOT 
GL 

APGCCGGGPGARTSREPSPPDAAMAAQR^RAPAMRTRGGDAALCAPETOWVKVHPTPGTMLFREILLG
QM 

G YTEGOGVYNWRS S EAATRQ LQAAI FHALLNATTYRDLEEDWRRHWARGLQ PQRL VRRYRNAREGD 
IA 

gvaervfdtwrctlrttlldfahgv\hx:fa 
GA

FLRQHTLPRQLATVAGAAERAGPGLLELAVAFDSTRMAEYDRVHIYYOT 
VL 

C PPLWTGDRLVFDS PVQRLC PEI VACHALREHAH ICRLRNTASVKVLLGRKSDSERGVAGAARWNKA 
LG

EDDETKAGSAASRLVRLI INMKGMRHV^
AQ 

DPGARPQQLRQAFQTAVVNNINGMLEGYINNLFGTIERLRETN^ 
QR 

AADRAAGGGAGRPAEADLLRADYDI IDVSKSMDDDTYVANSFQHQYI PAYGQDLERLSRLWEHELVRC
FK 

ILRHRNNQGQETTSISYSSGAIASFVAPYFEYVLRAPRAGALITGSDVII^EEELV^AVFKKTRLQTYL 
TD 

VAALFVADVQHAALPRPPSPTPADFRASASPRGGSRSRTRTRSRSPGRTPRGAPDQGWGVERRDGRPH 
AR
R* 

Gene matched: gi | 136794 | sp | P10190 | UL06_HSV11
Gene name: VIRION PROTEIN UL6^Agi | 73994 |


[SEQ ID NO: 296] = 15
ORF # = 6e 

ORF start site = 6065 
ORF end site = 8482 
ORF sequence:

MFCAAIRVAPVTTQSRTSLRVCTHVLFPDPALPVMRYAANGNSRSGRPVGTSKAATSRNHCRRGTCVT 
SS 

CCCESSRMRAMIGWTPCMDVKFKNASSLNRTAGLAPGCCGGGPGARTSREPSPPDAAMAAQRARAPAM 
RT 

25 RGGDAALC A PEDGWVKVH PTPGTMLFRE ILLGQMGYTEGQGVYNWRS S EAATRQLQAAI FHALLNAT 
TY 

RDLEEDWRRHWARGLQPQRLVRRYRNAREGDIAGVAERVFI^^ 
SG 

PTSFPKYIDV^TCIXSLVPILRKTREGEATQRI^AFLRQHTLPRQIAWAGAAERAGPGLLELAVAF 
TR

MAEYDRVH I YYNHRRGEWLVRDPVSGQRGECLVLC PPLWTGDRLVFDS PVQRLC PEI VACHALREHAH 
IC 

RLRNTASVKVLIX3RKSDSERGVAGAARVWKAI/3EOT 
RA 

YLDEAGGHLIDTPAVDHTLPGFGKGGTGRGSAAQDPGARPQQLRQAFQTAWNNINGMLEGYINNLFG
TI 

ERLRETNAGIATQLQARDRELRRAQAGALEREQRAADRAAGGGAGRPAEADLLRADYDIIDVSKSMDD 
DT 

YVANSFQHQYI PAYGQDLERLSRLWEHELVRCFKILRHRNNQGQETSI SYS SGAIASFVAPYFEYVLR 
AP

RAGAL ITGSDVILGEEELWEAVFKKTRLQTYLTDVAALFVADVQHAAL PRP PS PTPADFRASAS PRGG 
SR 

SRTRTRSRS PGRTPRGAPDQGWGVERRDGRPHARR* 

Gene matched: gi | 136794 | sp | P10190 | UL06_HSV11
Gene name: VIRION PROTEIN UL6^Agi | 73994 |

[SEQ ID NO: 297] = 15
ORF # = 6f

ORF start site = 6026 
ORF end site = 8482 
ORF sequence: 

MGRLRNAPESLTYMFCAAIRVAPVTTQSRTSLRVCTHVLFPDPALPVMRYAANGNSRSGRPVGTSKAA 
TS

RNHCRRGTCVTSSCCCESSRMRAMIGWTPCMDVKFKNASSLNRTAGLAPGCCGGGPGARTSREPSPPD 
AA 

MAAQRARAPAMRTRGGDAALCAPEDGWVKVHPTPGTMLFRE ILLGQM S EAATRQ 

LQ 

AAI FHALLNATTYRDLEEDWRRHWARGLQ PQRLVRRYRNAREGDI AGVAERVFDTWRCTLRTTLLDF
AH 

GVVIXrFAPGGPSGPTSFPKYIDWLTCLGLVPILRKTREGEATQRLGAFLRQHTLPRQLATVAGAAERA 
GP 

GLLEIJVVAFDSTRMAEYDRVH I YYNHRRGEWLTODPVSGQRGECLVLC PPLOTGDRLVFDS PVQRLC P 
EI

VACHALREHAHICRLRNTASVK^LGRKSDSERGVAGAARVWKALGEDDETKAGSAASRLVRLIINM 
KG 

MRHVGDINDTVRAYLDEAGGHLIDTPAVDHTLPGFGKGGTGRGSAAQDPGARPQQLRQAFQTAVVNNI 
NG 

MLEG YI NNLFGTI ERLRETNAGLATQLQARDRELRRAQAGALEREQRAADRAAGGGAGRPAEADLLRA
DY 

DI I DVSKSMDDDTYVANSFQHQYI PAYGQDLERLSRLWEHELVRCFKI LRHRNNQGQETS IS YSSGAI 
AS 

FVAPYFEYVLRAPRAGALITGSDVILGEEELWEAVFKKTRLQTYLTDVAALFVADVQHAALPRPPSPT 
PA

DFRASASPRGGSRSRTRTRSRSPGRTPRGAPDQGWGVERRDGRPHARR*

Gene matched: gi | 136794 | sp | P10190 | UL06_HSV11
Gene name: VIRION PROTEIN UL6^Agi | 73994 |


[SEQ ID NO: 298] = 15 
ORF # = 6g 

ORF start site = 6017 
ORF end site = 8482 
ORF sequence:

MVLMGRLRNAPESLTYMFCAAIRVAPVTTQSRTSLRVCTHVLFPDPALPVMRYAANGNSRSGRPVGTS 
KA 

ATSRNHCRRGTCVTSSCCCESSRMRAMIGWTPCMDVKFKNASSLNRTAGIAPGCCGGGPGARTSREPS 
PP

DAAMAAQRARAPAMRTRGGDAALCAPEDGWVKVHPTPGTMLFREI LLGQMGYTEGQGVYNWR S SEAA
TR 

QLQAAIFHALLJ^ATTYRDLEEDVTOIHW 
LD 

FAHG WDCFAPGGPSG PTS F PKYI DWLTCLGLVPI LRKTREGEATQRLGAFLRQHTLPRQLATVAGAA
ER 

AG PGLLELAVAFDS TRMAEYDRVH I YYNHRRGEWLVRDPVSGQRGECLVLCPPLWTGDRLVFDS PVQR 
LC 

PEIVACHALREHAHICRLRNTASVKVLLGRKSDSERGVAGAARVVNKALGEDDETKAGSAASRLW 
IN

MKGMRHVGDI NDTVRAYLDEAGGHLI DTPAVDHTLPGFGKGGTGRGSAAQD PGARPQQLRQAFQTAW 
NN 

INGMLEGYINNLFGTIERLRETNAGLATQLQARDRELRRAQAGALEREQRAADRAAGGGAGRPAEADL 
LR 

ADYDIIDVSKSMDDDTYVANSFQHQYIPAYGQDLERLSRLWEHELVRCFKILRHRNNQGQETSISYSS
GA 

I ASFVAPYFE YVLRAPRAGAL ITGS DVI LGEEELWEAVFKKTRLQTYLTDVAALFVADVQHAAL PRP P 
SP 

TPADFRASASPRGGSRSRTRTRSRSPGRTPRGAPDQGWGVERRDGRPHARR* 


Gene matched: gi | 136794 | sp | P10190 | UL06_HSV11
Gene name: VIRION PROTEIN UL6^Agi | 73994 |

[SEQ ID NO: 299] = 15
ORF # = 47a 
ORF start site = 88122 
ORF end site = 89564 
ORF sequence: 

MALGRVGLAVGLWGLLWVGWWLANAS PGRTI TVGPRGNASNAAPSAS PRNAS APRTTPTPPQPRKA
TK 

SKASTAKPAPPPKTGPPKTSSEPVRCNRHDPLARYGSRVQIRCRFPNSTRTESRLQIWRYATATDAEI 
GT 

APSLEEVMVNVSAPPGGQLVYDSAPNRTDPHVI WAEGAGPGAS PRLYS WG PLGRQRL 1 1 EELTLETQ 
GM

YYVA/WGRTDRPSAYGTWVRTOWRPPSLTIHPHAVLEGQPFKATCTAATYYPGNRAEFVWF^ 
DP 

AQIHTQTQENPDGFSTVSTVTSAAVGGQGPPRTFTCQLTWHRDSVSFSRRNASGTASVLPRPTITMEF 
TG 

DHAVCTAGCVPEGVTFAWFLGDDSS PAEKVAVASQTSCGRPGTAT IRSTLPVS YEQTEYICRLAGYPD
GI 

PVLEHHGSHQPPPRDPTERQVIRAVEGAGIGVAVLVAVVLAGTAVVYLTHASSVRYRRLR* 

Gene matched: gi | 138220 | sp | P06475 |VGLC_HSV23 
45 Gene name: GLYCOPROTEIN C PRECURSOR^Agi | 
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[SEQ ID NO: 300] = 15 
ORF # = 47b
ORF start site = 87918
ORF end site = 89564
ORF sequence:

MGAGVPWTGIKARGAGGPITVRVLGWEVAQKATHPCCSCPREAWSGNPPRCAGRAHRSFAGAGALLVMA
LGRVGLAVGLWGLLWVGWWLANASPGRTITVGPRGNASNAAPSASPRNASAPRTTPTPPQPRKATKSK
ASTAKPAPPPKTGPPKTSSEPVRCNRHDPLARYGSRVQIRCRFPNSTRTESRLQIWRYATATDAEIGTAP
SLEEVMVNVSAPPGGQLVYDSAPNRTDPHVIWAEGAGPGASPRLYSWGPLGRQRLIIEELTLETQGMYY
WVWGRTDRPSAYGTWVRVRVFRPPSLTIHPHAVLEGQPFKATCTAATYYPGNRAEFVWFEDGRRVFDPAQ
IHTQTQENPDGFSTVSTVTSAAVGGQGPPRTFTCQLTWHRDSVSFSRRNASGTASVLPRPTITMEFTGDH
AVCTAGCVPEGVTFAWFLGDDSSPAEKVAVASQTSCGRPGTATIRSTLPVSYEQTEYICRLAGYPDGIPV
LEHHGSHQPPPRDPTERQVIRAVEGAGIGVAVLVAWLAGTAWYLTHASSVRYRRLR*

Gene matched: gi|138220|sp|P06475|VGLC_HSV23
Gene name: GLYCOPROTEIN C PRECURSOR^Agi|

[SEQ ID NO: 301] = 15
ORF # = 52a
ORF start site = 97076
ORF end site = 95441
ORF sequence:

MSVLGDARHPRRFPSRGPRPFSVAGPGSLPPSPPPGARARLIRLSRSLFPDPTAPMDLLVDDLFADADGV
SPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPVPPAALFNRLLDDLGFSAGPALCTMLDTWNED
LFSGFPTNADMYRECKFLSTLPSDVIDWGDAHVPERSPIDIRAHGDVAFPTLPATRDELPSYYEAMAQFF
RGELRAREESYRTVLANFCSALYRYLRASWQLHRQAHMRGRNRDLREMLRCT
LF
LHLYLFLSREILWAAYAEQMMRPDLFDGLCCDLESWRQIACLFQPI^FINGSLTVRGVPVEARRLRELNH
IREHLNLPLVRSAAAEEPGAPLTTPPVLQGNQARSSGYFMLLIRAKLDSYSSVATSEGESVMREHAYSRG
RTRNNYGSTIEGLLDLPDDDDAPAEAGLVAPRMSFLSAGQRPRRLSTTAPITDVSLGDELRLDGEEVDMT
PADALDDFDLEMLGDVESPSPGMTHDPVSYGALDVDDFEFEQMFTDAMGIDDFGG*

Gene matched: gi|1168549|sp|P29793|ATIN_HSV23
Gene name: ALPHA TRANS-INDUCING PROTEIN



[SEQ ID NO: 302] = 15
ORF # = 52b
ORF start site = 969103
ORF end site = 95441
ORF sequence:

MDLLVDDLFADADGVSPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPVPPAALFNRLLDDLGFS
AGPALCTMLDTWNEDLFSGFPTNADMYRECKFLSTLPSDVIDWGDAHVPERSPIDIRAHGDVAFPTLPAT
RDELPSYYEAMAQFFRGELRAREESYRTVLANFCSALYRYLRASWQLHRQAHMRGRNRDLREMLRCT
DRYYRETARLARVLFLHLYLFLSREILWAAYAEQMMRPDLFDGLCCDLESWRQLACLFQPLMFINGSLTV
RGVPVEARRLRELNHIREHLNLPLVRSAAAEEPGAPLTTPPVLQGNQARSSGYFMLLIRAKLDSYSSVAT
SEGESVMREHAYSRGRTRNNYGSTIEGLLDLPDDDDAPAEAGLVAPRMSFLSAGQRPRRLSTTAPITDVS
LGDELRLDGEEVDMTPADALDDFDLEMLGDVESPSPGMTHDPVSYGALDVDDFEFEQMFTDAMGIDDFGG*

Gene matched: gi|1168549|sp|P29793|ATIN_HSV23
Gene name: ALPHA TRANS-INDUCING PROTEIN

[SEQ ID NO: 303] = 15
ORF # = 52c
ORF start site = 97097
ORF end site = 95441
ORF sequence:

MRGGGREMSVLGDARHPRRFPSRGPRPFSVAGPGSLPPSPPPGARARLIRLSRSLFPDPTAPMDLLVDDL
FADADGVSPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPVPPAALFNRLLDDLGFSAGPALCTM
LDTWNEDLFSGFPTNADMYRECKFLSTLPSDVIDWGDAHVPERSPIDIRAHGDVAFPTLPATRDELPSYY
EAMAQFFRGELRAREESYRTVLANFCSALYRYLRASWQLHRQAHMRGRNRDLREMLRTTIADRYYRETA






RLARVLFLHLYLFLSREILWAAYAEQMMRPDLFDGLCCDLESWRQLACLFQPLMFINGSLTVRGVPVEAR
RLRELNHIREHLNLPLVRSAAAEEPGAPLTTPPVLQGNQARSSGYFMLLIRAKLDSYSSVATSEGESVMR
EHAYSRGRTRNNYGSTIEGLLDLPDDDDAPAEAGLVAPRMSFLSAGQRPRRLSTTAPITDVSLGDELRLD
GEEVDMTPADALDDFDLEMLGDVESPSPGMTHDPVSYGALDVDDFEFEQMFTDAMGIDDFGG*

Gene matched: gi|1168549|sp|P29793|ATIN_HSV23
Gene name: ALPHA TRANS-INDUCING PROTEIN

SEQUENCE LISTING

(1) GENERAL INFORMATION

(i) APPLICANT: SmithKline Beecham Corporation

(ii) TITLE OF THE INVENTION: Novel Coding Sequences from Herpes
Simplex Virus Type-2

(iii) NUMBER OF SEQUENCES: 303

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SmithKline Beecham Corporation
(B) STREET: 709 Swedeland Road
(C) CITY: King of Prussia
(D) STATE: PA
(E) COUNTRY: U.S.A.
(F) ZIP: 19046

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: Windows
(D) SOFTWARE: FastSEQ for Windows Version 2.0b

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/049,018
(B) FILING DATE: 09-JUN-1997

(A) APPLICATION NUMBER: 60/030,279
(B) FILING DATE: 04-NOV-1996

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Geiger, Kathleen W.
(B) REGISTRATION NUMBER: 35,880
(C) REFERENCE/DOCKET NUMBER: P50583

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 610-270-5968
(B) TELEFAX: 610-270-5090
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8953 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:











1 LA- /ibL 1 vjVJVj 




LvjCG TTCbTC 


ou 




TCTCGCAGGC GTTCTATGGT TCCAAAGAGA TTATTGATAT AGCCCTCCAG CATGCCGTTG 120

ATGTTGTTGA CCACGGCCGT CTGAAACGCC TGGCGAAGCT GCTGCGGTCG CGCCCCCGGG 180

TCCTGGGCCG CCGACCCGCG GCCGGTGCCG CCCTTGCCGA ACCCAGGGAG GGTGTGGTCG 240

ACGGCGGGGG TGTCGATCAG GTGCCCCCCC GCCTCGTCCA AGTAGGCGCG TACCGTGTCG 300

TTGATGTCGC CCACGTGGCG CATGCCCTTC ATGTTGATGA TGAGCCGCAC GAGACACGAG 360

GCGGCCGAGC CGGCCTTCGT CTCGTCATCC TCCCCCAGCG CCTTATTGAC GACCCGCGCG 420

GCGCCAGCCA CCCCGCGCTC GCTGTCGCTC TTGCGCCCCA ACAGCACCTT GACGGACGCG 480

GTGTTGCGCA GACGGCAGAT GTGCGCGTGT TCCCGGAGGG CGTGGCACGC GACGATCTCG 540

GGGCACAGCC GCTGAACGGG CGAATCGAAG ACCAGGCGGT CGCCGGTCCA CAGGGGGGGG 600

CACAGCACCA GGCACTCGCC GCGCTGCCCG CTGACCGGGT CGCGCACCAG CCACTCCCCC 660

CGGCGATGGT TGTAGTAGAT GTGCACACGG TCGTATTCCG CCATGCGCGT GGAGTCGAAC 720

GCGACGGCCA GCTCCAGAAG CCCCGGGCCG GCGCGCTCCG CGGCCCCGGC GACCGTGGCC 780

AGCTGCCGGG GCAGCGTGTG CTGCCTGAGA AACGCCCCCA GGCGCTGCGT CGCCTCCCCC 840

TCGCGCGTCT TGCGCAATAT GGGAACCAGC CCCAGACACG TCAGCCAGTC GATATATTTG 900

GGGAAGCTGG TCGGTCCGCT TGGGCCGCCC GGCGCAAAGC AGTTTACCAC CCCGTGGGCA 960

AAGTCCAGCA GCGTCGTCCT GAGCGTGCAT CGCCACGTGT CGAACACCCG CTCGGCCACC 1020

CCGGCGATAT CGCCCTCCCG GGCGTTCCGG TACCTGCGAA CCAGCCGCTG CGGCTGGAGG 1080

CCGCGGGCCA CCACGTGGCG GCGCCAGTCC TCCTCCAGGT CCCGGTACGT TGTGGCGTTG 1140

AGGAGCGCGT GGAAGATCGC CGCCTGCAGC TGTCGGGTGG CGGCCTCGCT GGACCGGACG 1200

ACGTTGTACA CCCCCTGACC CTCGGTGTAC CCCATCTGCC CGAGGAGAAT CTCGCGGAAC 1260

AACATCGTCC CGGGGGTGGG GTGAACCTTC ACCCAGCCGT CCTCGGGGGC GCATAGCGCC 1320

GCGTCGCCGC CCCGCGTCCG CATCGCCGGC GCCCGCGCGC GCTGTGCGGC CATGGCGGCG 1380

TCCGGCGGGG AGGGCTCGCG GGACGTCCGG GCACCAGGTC CGCCCCCACA GCAGCCCGGG 1440

GCCAGACCCG CCGTGCGGTT CAGGGACGAG GCGTTTTTAA ATTTTACGTC CATGCACGGG 1500

GTCCAACCAA TCATCGCACG CATCCGAGAG CTCTCGCAGC AGCAGCTCGA CGTCACGCAG 1560

GTGCCGCGCC TGCAGTGGTT CCGGGACGTG GCGGCCTTGG AGGTCCCGAC CGGCCTGCCG 1620

CTCCGGGAGT TTCCGTTCGC GGCGTATCTC ATCACCGGCA ACGCCGGATC CGGAAAGAGT 1680

ACGTGCGTGC AGACCCTCAA CGAGGTCCTG GACTGCGTGG TCACGGGCGC CACGCGAATC 1740

GCGGCGCAGA ACATGTACGT TAAGCTCTCG GGGGCGTTTC TGAGTCGACC CATCAACACC 1800

ATCTTTCACG AGTTCGGGTT TCGCGGGAAT CACGTCCAGG CCCAGCTGGG GCAGCACCCG 1860

TACACCCTGG CCAGCAGCCC CGCCTCGCTG GAAGACCTGC AGCGGCGAGA CCTGACGTAC 1920

TACTGGGAGG TGATCCTCGA CATCACCAAG CGGGCCCTGG CGGCGCACGG GGGCGAAGAC 1980

GCGCGAAACG AGTTCCACGC CCTCACCGCC CTAGAGCAGA CTTTGGGGCT GGGCCAGGGT 2040

GCCCTCACGC GCCTGGCCTC GGTCACACAC GGGGCGCTGC CGGCTTTCAC CCGCAGCAAC 2100

ATTATCGTCA TCGACGAGGC CGGGCTCCTG GGGCGGCACC TACTCACGAC CGTGGTGTAT 2160

TGCTGGTGGA TGATTAACGC CCTGTACCAC ACCCCCCAGT ACGCGGGCCG CCTGCGGCCG 2220

GTGCTGGTGT GCGTGGGGTC GCCGACCCAG ACGGCCTCGC TGGAGTCCAC CTTCGAACAC 2280

CAAAAACTGC GATGCTCCGT CCGGCAGAGC GAAAACGTGC TCACGTACCT CATCTGCAAC 2340

CGCACCCTAC GCGAGTACAC GCGCCTCTCG CACAGCTGGG CCATTTTCAT TAACAACAAG 2400

CGATGTGTGG AGCACGAGTT CGGGAACCTC ATGAAGGTGC TGGAGTACGG CCTTCCCATC 2460

ACCGAGGAGC ACATGCAGTT TGTGGACCGC TTTGTCGTCC CGGAAAGTTA CATCACCAAC 2520

CCGGCCAACC TTCCGGGGTG GACGCGGCTG TTCTCGTCCC ACAAGGAGGT CAGCGCGTAC 2580

ATGGCCAAGC TCCACGCCTA CCTAAAGGTG ACTCGCGAGG GGGAGTTTGT TGTGTTTACC 2640

CTCCCCGTGC TTACGTTTGT GTCGGTCAAA GAGTTTGACA AGTATCGACG GCTCACGCAG 2700

CAACCCACGC TGACCATGGA AAAGTGGATC ACGGCCAACG CCAGTCGCAT CACCAACTAC 2760

TCCCAGAGTC AGGACCAGGA CGCGGGGCAC GTGCGCTGTG AGGTGCACAG CAAGCAACAG 2820

CTAGTCGTGG CCCGGAACGA CATCACGTAC GTCCTCAACA GCCAGGTCGC GGTGACCGCG 2880

CGCCTCCGAA AGATGGTGTT TGGGTTCGAC GGGACGTTTC GGACCTTCGA GGCTGTGCTG 2940

CGCGACGACA GCTTCGTGAA GACCCAGGGG GAGACCTCGG TGGAGTTCGC CTACCGGTTC 3000

CTGTCGCGGC TCATGTTCGG CGGGCTGATT CACTTTTACA ACTTTCTCCA GCGCCCCGGC 3060

CTGGACGCGA CCCAGAGGAC CCTGGCCTAC GGCCGCCTAG GGGAGCTGAC GGCAGAACTC 3120

CTGTCGCTAC GCCGGGACGC CGCCGGCGCA TCGGCAACCA GGGCCGCCGA CACCAGCGAC 3180

CGCTCTCCGG GGGAGCGTGC GTTCAATTTT AAGCACCTGG GCCCGCGGGA CGGGGGCCCG 3240

GACGACTTCC CCGACGACGA CCTTGACGTT ATCTTCGCCG GGCTGGACGA ACAGCAGCTG 3300

GACGTGTTCT ACTGCCACTA CGCCCTCGAA GAGCCGGAGA CCACCGCGGC CGTCCACGCC 3360

CAGTTTGGGC TCCTGAAGAG GGCCTTTCTG GGGCGATACC TTATCCTACG GGAGCTCTTC 3420

GGGGAGGTGT TTGAGAGCGC CCCCTTCAGC ACCTACGTGG ACAATGTCAT CTTCCGGGGC 3480

TGCGAGCTGC TGACCGGCTC GCCGCGCGGG GGGCTGATGT CCGTGGCCCT GCAGACGGAC 3540

AACTACACGC TGATGGGGTA CACGTACACC CGGGTGTTCG CGTTCGCGGA GGAGCTGCGG 3600

CGGCGGCACG CGACGGCCGG CGTGGCCGAG TTCTTGGAGG AGTCCCCCCT GCCCTACATC 3660

GTCCTGCGGG ACCAGCACGG CTTCATGTCT GTCGTCAATA CCAACATCAG TGAGTTTGTC 3720

GAGTCGATCG ACTCCACGGA GCTGGCCATG GCCATCAACG CCGACTACGG CATCAGCTCC 3780

AAACTCGCGA TGACCATCAC GCGCTCCCAG GGGCTCAGTC TGGACAAGGT CGCCATATGC 3840

TTCACGCCCG GAAACCTGCG CCTAAACAGC GCGTACGTAG CCATGTCCCG CACCACCTCA 3900

TCCGAGTTCC TGCACATGAA TCTAAACCCG CTCCGGGAGC GCCACGAACG CGATGACGTC 3960

ATTAGCGAGC ACATACTATC TGCTCTACGC GATCCGAATG TGGTCATTGT CTATTAACCC 4020

TCCATTCCCT CGCGTTCCCA CCGCACCCGG GCCGGGTGAC ATTCACCCCC ACCCCCCCGA 4080

GACATGGGGA ACCCCCAGAC GACCATCGCG TACAGCCTAC ATCACCCCAG GGCGTCGCTA 4140

ACGAGCGCGC TGCCGGACGC GGCACAGGTG GTGCACGTGT TTGAGTCAGG GACGCGCGCG 4200

GTTCTGACGC GGGGTCGAGC GCGCCAGGAC CGCCTGCCCC GCGGAGGCGT GGTGATACAA 4260 

CACACCCCCA TCGGGCTGCT GGTGATTATC GACTGTCGTG CCGAATTTTG CGCATACCGC 4320 

TTTATAGGAC GCGCTAGTAC CCAGAGGCTG GAGCGCTGGT GGGACGCCCA TATGTACGCG 4380 

TACCCCTTTG ACTCCTGGGT CAGCTCATCG CACGGCGAAA GCGTCCGGAG TGCGACGGCC 4440

GGCATCCTGA CGGTGGTGTG GACCCCGGAC ACCATCTACA TCACCGCAAC GATCTACGGG 4500 

ACGGCCCCCG AGGCGGCGCG GGGGTGCGAT AACGCACCCC TGGACGTCCG CCCAACCACA 4560 

CCCCCCGCCC CCGTATCCCC AACGGCGGGC GAGTTCCCAG CAAACACAAC AGACCTACTG 4620 

GTCGAGGTTC TGCGGGAAAT TCAGATCAGC CCCACCCTGG ACGACGCAGA CCCAACCCCC 4680 

GGAACCTGAA ACCTTCTTTC CTCCCCACCC CGCCCGCTTG CATATTCCCT CTGCGCGCGG 4740

CGACGGCACC GCCGGGCGAA CGAACGGTCA ATAAAATCAA TCAATCCATC ATCCAACAAA 4800 

ATAAGCTACG TGTTATTTAT TGAAACGTCA CAACACATCA GTAACGGGGG GAAGGGTAGG 4860 

GGGAAAAGAA AGGGGACGGC GGGGGTGCTT AGTCTGGTTC CGTAGACAGC ATGACGTTAT 4920 

CTCGATGGAG GCGCATGGGT TCGGACGAAA CAAACTCGTG TACAAACACG GGGACGGCCG 4980 

GGCAGAGCCG CCCATCCGAG GACCGCGTGT AATACTTGCG CGGCTTGCGC GACCGAATAA 5040

CCGCCCGCAG CTGCTCGCGC ACCTGCGCGG CGTTGGCGCG CTTGCACAAA ACGAACATCT 5100 

GGAGGCTCTT GTTGCTGCGT GGTACGCATG TGCGTTGCGG CGGTGCTCCG CGCTTGCGCT 5160 

GGCGCGCGGC CGTCCCCGAA AACGACGAGG TCTTGGTACA CGCGATGCTG AACTTGGCCA 5220 

GCGACAGCCG CAGGTCCTTA CGGATCGTAT CCGTGAGCTG GCGGCGCCCC AGTTCGTCGA 5280 

TCGAAGATAC CATAAACAGA GTATCAAAGG TGACGTAGGG GCCGTCCGCC CCCGGCGGGT 5340

CGCCGGCCAC GCTCGACGTA AGCGGGGAGA TCGCATCCCG GTCCGGCTCT CCGTCGAGAG 5400 

GCCCCACGTG CGCGTCCGGT GCGGTCCCCG CAATCGACTC TATGGGCGTC GTATTCGGCG 5460 

ACGCGCCAGG ATCATGGTTC TGGGGTGCAA ACGTCCAGCC CCACGAGGCA AGGATAGTAA 5520 

ACGCAGAGGG GGACCCTCTC TTCCCCCACG CCCGACATCA CGGACCGGTA TGAGACCCGA 5580 

GATTTAACCA TCAACAGTCT TTATTAATTG CCCCCTCGTA AATCAAGACC CCGGATGTCG 5640

GCATCTTATA CCGACCAGTC GATAGGCATA ATGTCCCGGG TTTCGAGGTA GCGATTCGCG 5700 

GCGAGGAAAT GCTGGCACGT CCCAAACGGG ACCTTGGAGA GGGGCGACGG GTGAGAAAAC 5760 

TTGAGGACGT AGTGTTGGCG AGGGTCGGGC CTGATCGCGT TCTGGGCATG GGCGCCCCAG 5820 

AGCATAAAGA CCAGGCCCGG GCGGCGCGCG GCCAGCCGTC GGACCACCCC GCCCACAAAG 5880 

CGGTCCCATC CAAGCTTGGA GTGGGACGCC GCCGCCCCGC GCTTGACGGT CAGGGTCGTG 5940

TTCAACAACA GCACGCCGTC GCGAGCCCAC TTTTCCAGGC AGCCGCGGCC GCTCATGCGC 6000 

GCGTCGGGGT AACAATTTTT TACCGCCGCC AGCACGTTCC GTAGACTCGG AGGCACCGGC 6060 

ACATCCGCAC GCACGCTAAA CGCCAGGCCG TGCGCCTGGC CGGGATGGTG GTACGGGTCC 6120 

TGCCCGATGA TAACCACGCG CACGTCGTCG GGGGTACAAT ACCGCGTCCA GGAGAACACA 6180 

TCCTCCCGCG GCGGCAGCAC CTCTTCGGTC TGGCACCGAC GGTCATACTC CGCGAGGAGG 6240

CGCGCGGTTA GGGGGTTCGC GAGCTCCGGC TCCAACAGGG GCCGCCAGGC GTCGTCGATC 6300 

AGAAACGCGC GCCGAAACGC GGCCCAGTCC AGGGGCACGG AAGTCGGCAG GGGCGGCGTC 6360 

GCATCGCCCG ACAGCCCCAG GGGCTGTTCG GGGGTTGTAG ATGCGGAAAA CATCACGCCC 6420 

GGCGGACAGC CTCTAGGGCG ACGACGCTGG ACAGCCGACC CGGCCGCTAG ACGGGTCGGA 6480 

TATCCTGCTC CCGACCCAGG GTGGCTTGCG ATGCAGACGT GGCTAGTTGC GTCGGAGGCG 6540

AGTATGCCGG CGCCGACCTC GCGTCGGGGA GACCCGCCGT GGGGGGGCGT TCGAAAGGGC 6600 

GAGGACGGGC GGCTGGGTGG CGAGGGGCTT CGACTGCGAG CTCGCTTCAT CGTCCGACGA 6660 

CACAGGCGGT TCCCACGTTG ACGTGGTGTT GGCAGGCCGT AAATCGCGTC GCCCGACGCA 6720

GCGGCGAGTG CGTGAGTAGT CAAAGTTTAC ACACCCGGCC CTGACGGGTG CGTGGCTGAC 6780

GGCCTGTTTT CGACTGCCCA ACGCATCGCG TATCTCTTTA TAAAGGGCCC GGCGCGTCGT 6840

TGTTTCCTGG GTGTCGGCCG GAAACACAGA GTGACTCAAG TCCTCCAAAA ATCCCCCCGC 6900

AAAGAGAAAG GGGTTAACCA GATACGCCCT CTGGGCGTGC CTATCCCACA AAAACGTGTC 6960

CAACCCCGGG CAGTGATAGC GAAGAAATAT TCCGTCTATG CGGGCATAGT CAATAACGGA 7020

CGGGGCCTCG TAGCGCCAAG AAACATCGTC CGCGGGGGTC CGCATGCAAG GCACTCTTAG 7080

TATGTCCCCC ACCTCTTTGG CAATAACACT ACGAAGAACA TATTCGGTTG CCTGTGACCC 7140

ACCCCACGCC CCCCAGGGTC CCATAACGAC AAGCCCAAAC AGACAGACGA ACCCCATAGC 7200

GAGCGGACAG TGTAACCGGT AAGCCCCCTT GTTCCCGCAT AAAAAACGTC CAAACAAGAC 7260

AACCGCGAGC AACCGAATCA CGCGGGTCCA ATATGCCCAT TCCCGCGCTT TCTACCGCTT 7320

TATATATCCC CCGTGTCCTC CCCTCCCCCG CGTCCTCCCA TCCCCCGCGT CCTCCCTTCC 7380

CCCGCGTCCT CCCATCCCCC GCGTCCTCCC CTCCCCTGCG CACACGTGAT AGGTTTTGGG 7440

AACCCGAGGG GCGACGCGGG GAAAGCGCGC CCCCGCCCGG CCGCCGAGCG CCCCCGCCCG 7500

GCAGCCGAGC GCCCCCGCCC GGCCACCGCG AGCCCCCGCC CGGCCGCCCG GGTCGCGCCG 7560

GCGCCCCCTC CCGGCGCTTC CGGGGCCTTT CTGTCGTTCC CCGCCGGGAC CCCGGCCCCG 7620

CCCCACCGCC CCGCCCGGCA GGGGGGCCCC GGCGCCGCGC AAAACACACA GACGAACACA 7680

CGGTGGCGAT CTTTTCTTTA CTTCGGCAGA CCAGCGAGCC CCGGCCCCGG CCCGCGCCCC 7740

GCCGCCACAC CCACGGCACC CCCCCCGCCG CCCACCCCGG GGTCCACACA GGAGCGCGCG 7800

GGCGGCAAAA ACGCGGGCGT            TTTTTTTCCC CTTTTTCTCC TCTTTTTCTC 7860

CCCCTCTTTC TTTTCCTTCC CCTTTTTCTT CTTCCCTCTC TCTTTCTTTC TCCTTCTCTC 7920

TCTTCTCTTT CTCTTTCTCT TTCTTCCTCT CTCTTTTCTT CTCCTCCTCC TATCCTCTTA 7980

TCTGTCCACT TCTCCTTCTT TTTTCTTGCC TGCTGTTTCT CTTTTTCTTT CTCTTTCTTC 8040

TATTCTCTGC CTCTCTCCTT CTTACTTCTC            CTATCTACTG TCACTCTATC 8100

TCTTTCCCTT CTTTTATATG TGTCGTATTC ATTCTTTCTC TACTCACGTT ACTCTATCTA 8160

TCTTCTTCAT CATCTCTCTC TCCTACTCTC TCTCCCTCCT TACTACTTCT CTCTTCCTCT 8220

TTTCCTTTAA TATATTTTCT TTCTTTACTC ATCCCTTTTT TCACTTTACT ATTCCATGTG 8280

TATCTCTTCT CTCTTCCTTT TTTCTCTCTC TTTATTTCTC ACTTTCTCCC TCCTCCACTC 8340

TTCTCATCTT TTTTTCTCTA CTCAACATAT CTCTCATCTC TCTCTCTCTA TTTACTACTC 8400

CTACTTCTTT TTTCTCTCTC TCATATTCTA TTTTCACTTT TCTTCCTCCT TCTCATATCT 8460

CACCTCTCGT TCTCTCCCTC TTTTCATTTC ATCTCTTATA TCCTCTCTAT CTTTATTCCT 8520

CCTCTCTCAT TPTPTOTCPT CTGCTCACAP TTACTTACTC CTCCTCTCCA ATTTGTCTCT 8580

TCGTCTCTCA CTCTTCTCCT TTCTCATTTC TTTCTATACT CTCTCTATCT TCGTATCTCT 8640

TTTATCTATC TACTTCTCTA TTCTATCTCC TCTATGCTTG ACTCTCGTTA CATTCTCATC 8700

CCTTCGCTCC TTATCAATTA TCTTCGCACT GCTTANGTAT TCTCTCTCTC TCTCCTCTCT 8760

CTCTCATTTT CTCCTCTCCT GCTTCTCTCT CTTTTCTTGT GGCTCTATCC ACTTCTTCAT 8820

TATATTCTCT CACTTATTTT CTTCTTTCTC TCTCACTGCT CTCTCACCTC TCTCTCTACT 8880

TTCTCTCTTC CTCTGTTTTC TCTCCTCTCT TTGTCTATTC ATCCCCTCTT AACGTTCTTC 8940

TTCTCTTCCG TC 8953



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 594 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Val Leu Met Gly Arg Leu Arg Asn Ala Pro Glu Ser Leu Thr Tyr
1 5 10 15

Met Phe Cys Ala Ala Ile Arg Val Ala Pro Val Thr Thr Gln Ser Arg
20 25 30

Thr Ser Leu Arg Val Cys Thr His Val Leu Phe Pro Asp Pro Ala Leu
35 40 45

Pro Val Met Arg Tyr Ala Ala Asn Gly Asn Ser Arg Ser Gly Arg Pro
50 55 60

Val Gly Thr Ser Lys Ala Ala Thr Ser Arg Asn His Cys Arg Arg Gly
65 70 75 80

Thr Cys Val Thr Ser Ser Cys Cys Cys Glu Ser Ser Arg Met Arg Ala
85 90 95

Met Ile Gly Trp Thr Pro Cys Met Asp Val Lys Phe Lys Asn Ala Ser
100 105 110

Ser Leu Asn Arg Thr Ala Gly Leu Ala Pro Gly Cys Cys Gly Gly Gly
115 120 125

Pro Gly Ala Arg Thr Ser Arg Glu Pro Ser Pro Pro Asp Ala Ala Met
130 135 140

Ala Ala Gln Arg Ala Arg Ala Pro Ala Met Arg Thr Arg Gly Gly Asp
145 150 155 160

Ala Ala Leu Cys Ala Pro Glu Asp Gly Trp Val Lys Val His Pro Thr
165 170 175

Pro Gly Thr Met Leu Phe Arg Glu Ile Leu Leu Gly Gln Met Gly Tyr
180 185 190

Thr Glu Gly Gln Gly Val Tyr Asn Val Val Arg Ser Ser Glu Ala Ala
195 200 205

Thr Arg Gln Leu Gln Ala Ala Ile Phe His Ala Leu Leu Asn Ala Thr
210 215 220

Tyr Asp Leu Glu Glu Asp Trp Arg Arg His Val Val Arg Leu Gln Pro
225 230 235 240

Gln Arg Leu Val Arg Arg Tyr Arg Asn Ala Arg Glu Gly Asp Ile Ala
245 250 255

Gly Val Ala Glu Arg Val Phe Asp Thr Trp Arg Cys Thr Leu Arg Thr
260 265 270

Thr Leu Leu Asp Phe Ala His Gly Val Val Asn Cys Phe Ala Pro Gly
275 280 285

Gly Pro Ser Gly Pro Thr Ser Phe Pro Lys Tyr Ile Asp Trp Leu Thr
290 295 300

Cys Leu Gly Leu Val Pro Ile Leu Arg Lys Thr Arg Glu Gly Glu Ala
305 310 315 320

Thr Gln Arg Leu Gly Ala Phe Leu Arg Gln His Thr Leu Pro Arg Gln
325 330 335

Leu Ala Thr Val Ala Gly Ala Ala Glu Arg Ala Gly Pro Gly Leu Leu
340 345 350

Glu Leu Ala Val Ala Phe Asp Ser Thr Arg Met Ala Glu Tyr Asp Arg
355 360 365

Val His Ile Tyr Tyr Asn His Arg Arg Gly Glu Trp Leu Val Arg Asp
370 375 380

Pro Val Ser Gly Gln Arg Gly Glu Cys Leu Val Leu Cys Pro Pro Leu
385 390 395 400

Trp Thr Gly Asp Arg Leu Val Phe Asp Ser Pro Val Gln Arg Leu Cys
405 410 415

Pro Glu Ile Val Ala Cys His Ala Leu Arg Glu His Ala His Ile Cys
420 425 430

Arg Leu Arg Asn Thr Ala Ser Val Lys Val Leu Leu Gly Arg Lys Ser
435 440 445

Asp Ser Gly Val Ala Gly Ala Ala Arg Val Val Asn Lys Ala Leu Gly
450 455 460

Glu Asp Asp Glu Thr Lys Ala Gly Ser Ala Ala Ser Cys Leu Val Arg
465 470 475 480

Leu Ile Ile Asn Met Lys Gly Met Arg His Val Gly Asp Ile Asn Asp
485 490 495

Thr Val Arg Ala Tyr Leu Asp Glu Ala Gly Gly His Leu Ile Asp Thr
500 505 510

Pro Ala Val Asp His Thr Leu Pro Gly Phe Gly Lys Gly Gly Thr Gly
515 520 525

Arg Gly Ser Ala Ala Gln Asp Pro Gly Ala Arg Pro Gln Gln Leu Arg
530 535 540

Gln Ala Phe Gln Thr Ala Val Val Asn Asn Ile Asn Gly Met Leu Glu
545 550 555 560

Gly Tyr Ile Asn Asn Leu Phe Gly Thr Ile Glu Arg Leu Arg Glu Thr
565 570 575

Asn Ala Gly Leu Ala Thr Gln Leu Gln Arg Gly Ser Ser Arg Ser Thr
580 585 590

Ala Xaa
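
The amino acid sequences in this listing use three-letter residue codes, whereas the ORF tables use one-letter codes. The following is a minimal sketch of converting one notation to the other using the standard IUPAC code table, with Xaa mapped to X. The function name `to_one_letter` is hypothetical. Purely numeric tokens are skipped so that position rows can be passed through unchanged, and any unrecognized token raises an error rather than being guessed.

```python
# Standard three-letter to one-letter amino acid codes; Xaa is "unknown".
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    codes = []
    for token in residues.split():
        if token.isdigit():
            # Position markers from the numbering rows are not residues.
            continue
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {token!r}")
        codes.append(THREE_TO_ONE[token])
    return "".join(codes)


first_row = "Met Val Leu Met Gly Arg Leu Arg Asn Ala Pro Glu Ser Leu Thr Tyr"
print(to_one_letter(first_row))  # MVLMGRLRNAPESLTY
```

The converted first row of SEQ ID NO: 2 matches the opening residues of the one-letter sequence given for ORF 6g.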



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 877 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Ala Ala Ser Gly Gly Glu Gly Ser Arg Asp Val Arg Ala Pro Gly
1 5 10 15

Pro Pro Pro Gln Gln Pro Gly Ala Arg Pro Ala Val Arg Phe Arg Asp
20 25 30

Glu Ala Phe Leu Asn Phe Thr Ser Met His Gly Val Gln Pro Ile Ile
35 40 45

Ala Arg Ile Arg Glu Leu Ser Gln Gln Gln Leu Asp Val Thr Gln Val
50 55 60

Pro Arg Leu Gln Trp Phe Arg Asp Val Ala Ala Leu Glu Val Pro Thr
65 70 75 80

Gly Leu Pro Leu Arg Glu Phe Pro Phe Ala Ala Tyr Leu Ile Thr Gly
85 90 95

Asn Ala Gly Ser Gly Lys Ser Thr Cys Val Gln Thr Leu Asn Glu Val
100 105 110

Leu Asp Cys Val Val Thr Gly Ala Thr Arg Ile Ala Ala Gln Asn Met
115 120 125

Tyr Val Lys Leu Ser Gly Ala Phe Leu Ser Arg Pro Ile Asn Thr Ile
130 135 140

Phe His Glu Phe Gly Phe Arg Gly Asn His Val Gln Ala Gln Leu Gly
145 150 155 160

Gln His Pro Tyr Thr Leu Ala Ser Ser Pro Ala Ser Leu Glu Asp Leu
165 170 175

Gln Arg Arg Asp Leu Thr Tyr Tyr Trp Glu Val Ile Leu Asp Ile Thr
180 185 190

Lys Arg Ala Ala His Gly Gly Glu Asp Ala Arg Asn Glu Phe His Ala
195 200 205

Leu Thr Ala Leu Glu Gln Thr Leu Gly Leu Gly Gln Gly Ala Leu Thr
210 215 220

Arg Leu Ala Ser Val Thr His Gly Ala Leu Pro Ala Phe Thr Arg Ser
225 230 235 240

Asn Ile Ile Val Ile Asp Glu Ala Gly Leu Leu Gly Arg His Leu Leu
245 250 255

Thr Thr Val Val Tyr Cys Trp Trp Met Ile Asn Ala Leu Tyr His Thr
260 265 270

Pro Gln Tyr Ala Gly Arg Leu Arg Pro Val Leu Val Cys Val Gly Ser
275 280 285

Pro Thr Gln Thr Ala Ser Leu Glu Ser Thr Phe Glu His Gln Lys Leu
290 295 300

Arg Cys Ser Val Arg Gln Ser Glu Asn Val Leu Thr Tyr Leu Ile Cys
305 310 315 320

Asn Arg Thr Leu Arg Glu Tyr Thr Arg Leu Ser His Ser Trp Ala Ile
325 330 335

Phe Ile Asn Asn Lys Arg Cys Val Glu His Glu Phe Gly Asn Leu Met
340 345 350

Lys Val Leu Glu Tyr Gly Leu Pro Ile Thr Glu Glu His Met Gln Phe
355 360 365

Val Asp Arg Phe Val Val Pro Glu Ser Tyr Ile Thr Asn Pro Ala Asn
370 375 380

Leu Pro Gly Trp Thr Arg Leu Phe Ser Ser His Lys Glu Val Ser Ala
385 390 395 400

Tyr Met Ala Lys Leu His Ala Tyr Leu Lys Val Thr Arg Glu Gly Glu
405 410 415

Phe Val Val Phe Thr Leu Pro Val Leu Thr Phe Val Ser Val Lys Glu
420 425 430

Phe Asp Lys Tyr Arg Arg Leu Thr Gln Gln Pro Thr Leu Thr Met Glu
435 440 445

Lys Trp Ile Thr Ala Asn Ala Ser Arg Ile Thr Asn Tyr Ser Gln Ser
450 455 460

Gln Asp Gln Asp Ala Gly His Val Arg Cys Glu Val His Ser Lys Gln
465 470 475 480

Gln Leu Val Val Ala Arg Asn Asp Ile Thr Tyr Val Leu Asn Ser Gln
485 490 495

Val Ala Val Thr Ala Arg Leu Arg Lys Met Val Phe Gly Phe Asp Gly
500 505 510

Thr Phe Arg Thr Phe Glu Ala Val Leu Arg Asp Asp Ser Phe Val Lys
515 520 525

Thr Gln Gly Glu Thr Ser Val Glu Phe Ala Tyr Arg Phe Leu Ser Arg
530 535 540

Leu Met Phe Gly Gly Leu Ile His Phe Tyr Asn Phe Leu Gln Arg Pro
545 550 555 560

Gly Leu Asp Ala Thr Gln Arg Thr Leu Ala Tyr Gly Arg Leu Gly Glu
565 570 575

Leu Thr Ala Glu Leu Leu Ser Leu Arg Arg Asp Ala Ala Gly Ala Ser
580 585 590

Ala Thr Arg Ala Ala Asp Thr Ser Asp Arg Ser Pro Gly Glu Arg Ala




595 600 605

Phe Asn Phe Lys His Leu Gly Pro Arg Asp Gly Gly Pro Asp Asp Phe
610 615 620

Pro Asp Asp Asp Leu Asp Val Ile Phe Ala Gly Leu Asp Glu Gln Gln
625 630 635 640

Leu Asp Val Phe Tyr Cys His Tyr Ala Leu Glu Glu Pro Glu Thr Thr
645 650 655

Ala Ala Val His Ala Gln Phe Gly Leu Leu Lys Arg Ala Phe Leu Gly
660 665 670

Arg Tyr Leu Ile Leu Arg Glu Leu Phe Gly Glu Val Phe Glu Ser Ala
675 680 685

Pro Phe Ser Thr Tyr Val Asp Asn Val Ile Phe Arg Gly Cys Glu Leu
690 695 700

Leu Thr Gly Ser Pro Arg Gly Gly Leu Met Ser Val Gln Thr Asp Asn
705 710 715 720

Tyr Thr Leu Met Gly Tyr Thr Tyr Thr Arg Val Phe Ala Phe Ala Glu
725 730 735

Glu Leu Arg Arg Arg His Ala Thr Ala Gly Val Ala Glu Phe Leu Glu
740 745 750

Glu Ser Pro Leu Pro Tyr Ile Val Leu Arg Asp Gln His Gly Phe Met
755 760 765

Ser Val Val Asn Thr Asn Ile Ser Glu Phe Val Glu Ser Ile Asp Ser
770 775 780

Thr Glu Leu Ala Met Ala Ile Asn Ala Asp Tyr Gly Ile Ser Ser Lys
785 790 795 800

Leu Ala Met Thr Ile Thr Arg Ser Gln Gly Leu Ser Leu Asp Lys Val
805 810 815

Ala Ile Cys Phe Thr Pro Gly Asn Leu Arg Leu Asn Ser Ala Tyr Val
820 825 830

Ala Met Ser Arg Thr Thr Ser Ser Glu Phe Leu His Met Asn Leu Asn
835 840 845

Pro Leu Arg Glu Arg His Glu Arg Asp Asp Val Ile Ser Glu His Ile
850 855 860

Leu Ser Ala Leu Arg Asp Pro Asn Val Val Ile Val Tyr
865 870 875

(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear




(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Gly Asn Pro Gln Thr Thr Ile Ala Tyr Ser Leu His His Pro Arg
1 5 10 15

Ala Ser Leu Thr Ser Ala Leu Pro Asp Ala Ala Gln Val Val His Val
20 25 30

Phe Glu Ser Gly Thr Arg Ala Val Leu Thr Arg Gly Arg Ala Arg Gln
35 40 45

Asp Arg Leu Pro Arg Gly Gly Val Val Ile Gln His Thr Pro Ile Gly
50 55 60

Leu Leu Val Ile Ile Asp Cys Arg Ala Glu Phe Cys Ala Tyr Arg Phe
65 70 75 80

Ile Gly Arg Ala Ser Thr Gln Arg Leu Glu Arg Trp Trp Asp Ala His
85 90 95

Met Tyr Ala Tyr Pro Phe Asp Ser Trp Val Ser Ser Ser His Gly Glu
100 105 110

Ser Val Arg Ser Ala Thr Ala Gly Ile Leu Thr Val Val Trp Thr Pro
115 120 125

Asp Thr Ile Tyr Ile Thr Ala Thr Ile Tyr Gly Thr Ala Pro Glu Ala
130 135 140

Arg Cys Asp Asn Ala Pro Leu Asp Val Arg Pro Thr Thr Pro Pro Ala
145 150 155 160

Pro Val Ser Pro Thr Ala Gly Glu Phe Pro Ala Asn Thr Thr Asp Leu
165 170 175

Leu Val Glu Val Leu Arg Glu Ile Gln Ile Ser Pro Thr Leu Asp Asp
180 185 190

Ala Asp Pro Thr Pro Gly Thr
195

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Val Gly Pro Leu Asp Gly Glu Pro Asp Arg Asp Ala Ile Ser Pro Leu
1 5 10 15

Thr Ser Ser Val Ala Gly Asp Pro Pro Gly Ala Asp Gly Pro Tyr Val
20 25 30

Thr Phe Asp Thr Leu Phe Met Val Ser Ser Ile Asp Glu Leu Gly Arg
35 40 45

Arg Gln Leu Thr Asp Thr Ile Arg Lys Asp Leu Arg Leu Ser Leu Ala
50 55 60

Lys Phe Ser Ile Ala Cys Thr Lys Thr Ser Ser Phe Ser Gly Thr Ala
65 70 75 80

Ala Arg Gln Arg Lys Arg Gly Ala Pro Pro Gln Arg Thr Cys Val Pro
85 90 95

Arg Ser Asn Lys Ser Leu Gln Met Phe Val Leu Cys Lys Arg Ala Asn
100 105 110

Ala Ala Gln Val Arg Glu Gln Leu Arg Ala Val Ile Arg Ser Arg Lys
115 120 125

Pro Arg Lys Tyr Tyr Thr Arg Ser Ser Asp Gly Arg Leu Cys Pro Ala
130 135 140

Val Pro Val Phe Val His Glu Phe Val Ser Ser Glu Pro Met Arg Leu
145 150 155 160

His Arg Asp Asn Val Met Leu Ser Thr Glu Pro Asp
165 170

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 334 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Lys Arg Ala Arg Ser Arg Ser Pro Ser Pro Pro Ser Arg Pro Ser
1 5 10 15

Ser Pro Phe Arg Thr Pro Pro His Gly Gly Ser Pro Arg Arg Glu Val
20 25 30

Gly Ala Gly Ile Leu Ala Ser Asp Ala Thr Ser His Val Cys Ile Ala
35 40 45

Ser His Pro Gly Ser Gly Ala Gly Tyr Pro Thr Arg Leu Ala Ala Gly
50 55 60

Ser Ala Val Gln Arg Arg Arg Pro Arg Gly Cys Pro Pro Gly Val Met
65 70 75 80

Phe Ser Ala Ser Thr Thr Pro Glu Gln Pro Leu Gly Leu Ser Gly Asp
85 90 95

Ala Thr Pro Pro Leu Pro Thr Ser Val Pro Leu Asp Trp Ala Ala Phe
100 105 110

Arg Arg Ala Phe Leu Ile Asp Asp Ala Trp Arg Pro Leu Leu Glu Pro
115 120 125

Glu Leu Ala Asn Pro Leu Thr Ala Arg Leu Leu Ala Glu Tyr Asp Arg
130 135 140

Arg Cys Gln Thr Glu Glu Val Leu Pro Pro Arg Glu Asp Val Phe Ser
145 150 155 160

Trp Thr Arg Tyr Cys Thr Pro Asp Asp Val Arg Val Val Ile Ile Gly
165 170 175

Gln Asp Pro Tyr His His Pro Gly Gln Ala His Gly Leu Ala Phe Ser
180 185 190

Val Arg Ala Asp Val Pro Val Pro Pro Ser Leu Arg Asn Val Leu Ala
195 200 205

Ala Val Lys Asn Cys Tyr Pro Asp Ala Arg Met Ser Gly Arg Gly Cys
210 215 220

Leu Glu Lys Trp Ala Arg Asp Gly Val Leu Leu Leu Asn Thr Thr Leu
225 230 235 240

Thr Val Lys Arg Gly Ala Ala Ala Ser His Ser Lys Leu Gly Trp Asp
245 250 255

Arg Phe Val Gly Gly Val Val Arg Arg Leu Ala Ala Arg Arg Pro Gly
260 265 270

Leu Val Phe Met Leu Trp Gly Ala His Ala Gln Asn Ala Ile Arg Pro
275 280 285

Asp Pro Arg Gln His Tyr Val Leu Lys Phe Ser His Pro Ser Pro Leu
290 295 300

Ser Lys Val Pro Phe Gly Thr Cys Gln His Phe Leu Ala Ala Asn Arg
305 310 315 320

Tyr Leu Glu Thr Arg Asp Ile Met Pro Ile Asp Trp Ser Val
325 330



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 183 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Val Pro Cys Met Arg Thr Pro Ala Asp Asp Val Ser Trp Arg Tyr Glu
1 5 10 15

Ala Pro Ser Val Ile Asp Tyr Ala Arg Ile Asp Gly Ile Phe Leu Arg
20 25 30

Tyr His Cys Pro Gly Leu Asp Thr Phe Leu Trp Asp Arg His Ala Gln
35 40 45

Arg Ala Tyr Leu Val Asn Pro Phe Leu Phe Ala Gly Gly Phe Leu Glu
50 55 60

Asp Leu Ser His Ser Val Phe Pro Ala Asp Thr Gln Glu Thr Thr Thr
65 70 75 80

Arg Arg Ala Leu Tyr Lys Glu Ile Arg Asp Ala Leu Gly Ser Arg Lys
85 90 95

Gln Ala Val Ser His Ala Pro Val Arg Ala Gly Cys Val Asn Phe Asp
100 105 110

Tyr Ser Arg Thr Arg Arg Cys Val Gly Arg Arg Asp Leu Arg Pro Ala
115 120 125

Asn Thr Thr Ser Thr Trp Glu Pro Pro Val Ser Ser Asp Asp Glu Ala
130 135 140

Ser Ser Gln Ser Lys Pro Leu Ala Thr Gln Pro Pro Val Leu Ala Leu
145 150 155 160

Ser Asn Ala Pro Pro Arg Arg Val Ser Pro Thr Arg Gly Arg Arg Arg
165 170 175

His Thr Arg Leu Arg Arg Asn
180



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9218 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



CCTCCGGACG TGCGATCGGA TCCCGCGAGT CGAAATCCCA CACAGCAGAC CCGTGGGTGT 60 
GCTAGATCGA ACGAGCGGCA GGATCGCGTG CTGGCCCCTT GATACGATCT CGTCGACCGG 120
GGACTCCCCT CTACCCCCAC CCAACCAGCG CGCCGGCGCT TAGGGTGTGA CCCCCCCCAT 180

GGCATCCGGG GTTTCCCCGG CCCACCCCCA AACCCCGGTT GGGGCGGGCA GCCGAGACCT 240 

TTCCTTAAAA GGCACCCCAT CCGACGGCAT GCAGCCCAGA GGAGCGGACA CGCTTGAAGG 300 

GCACTCGCTT CCGACCGACG GGCCCCCGCA CCGGGGCGGC GACCATGATC CGGCGGCGGG 360 

GAAACGTGGA GATTCGGGTC TACTACGAGT CTGTGCGGCC CTCTCGATCC CGAAGCCATC 420

TGAAGCCGTC CGACCATCAA GAATTCCCAG GGCACCACGT GTCCCCAGGG AGCCCCGGGT 480 

TCCCCGAGAG CCCAGGGAAC CGCGAGTTCC ACGATCTCCC AGAGAACCCA GGGTCCCGCG 540 

CATACCCAGG GACCCGCGAC CCCCACGACC CCCACGGGTG CCCAGGGAGC CTAGACCCCC 600 

ACGGGAACCC CGCGCAACCC GCGGGCTTGC CTAGCCCGGT CCCCTACGCC CCCCTCGGCA 660 

GCCCGGACCC CTCATCGCCG CGCCAACGCA CGTACGTTCT GCCCCGCGTC GGGATCCGTA 720

ACGCGCCCGC GTCCGACACC CGGGCCCCAA AGCGTGCCCA CTCGCGGCAC CGCGCGGACC 780 

GGCCCCCGGA GTCCCCCGGC TCCGAGTTGT ACCCTCTCAA CGCCCAGGCC CTGGCGCACC 840 

TGCAGATGCT GCCCGCGGAC CACCGGGCCT TTTTTCGGAC GGTGATCGAG GTGTCCCGCC 900 

TGTGTGCTCT CAACACCCAC GACCCACCGC CCCCGCTGGC GGGAGCCAGG GTCGGACAGG 960 

AGGCGCAGCT GGTTCATACC CAATGGCTTC GGGCCAACAG GGAGTCCTCG CCGCTGTGGC 1020

CCTGGCGGAC GGCCGCCATG AATTTTATCG CCGCGGCTGC GCCCTGCGTC CAAACACATC 1080 

GCCATATGCA CGACCTGCTG ATGGCATGCG CCTTCTGGTG CTGTTTGGCG CACGCGTCGA 1140 

CGTGTTCCTA CGCGGGGTTA TATTCGGCAC ACTGCCAGCA TTTGTTTCGT GCGTTTGGGT 1200 

GCGGACCCCC GGTCCTGACC ACGTCCCGGG GACAGGGTGG TTGGTGTAAT TAATAATAAA 1260 

ATCGTGAAAA TTGAAATCGC TTTGTGTGTT GCTGCGGGGA CGGGGGCAAA TGCGTCGTGA 1320

CTCTAGAACG CCAGATGTGG GGTGCGGATG GGGAAATGTA TGGGTCCTTC GTCTGGAGCC 1380 

CGTACCCGGC AGAGAGATTT CCCCAGCACG GAGGAACTGG GGTACTGCAC TGCCCCCCTC 1440 

CTGGGGGGGG GGGGGGCGAG AGGTCAATAG ATTTCCCCAA GAGACTTCCC TAACACGGAG 1500 

GAGCCGGGAG AGTTCAATAG ATTTCCCAAA CACTGAGGAA CTGGGGTACT GCACTGCCCC 1560 

CTCCTGGGGG GGGGGGCGAG AGGTCAATAG ATTTCCCCAA GAGACTTCCC TAACACGGAG 1620

GAGCCGGGAG AGTTCAATAG ATTTCCCAAA CACTGAGGAA CTGGGGTACT GCACTTGCCC 1680 

CCCCCGGGGG GTGAAATTCC GAGAATTTTT TACCCTTTTT TGCATTTCCT TCCCCCCCCC 1740 

CCCCAAAAAA AAAGACAACC TAGTAGACCG TAATGACAAT CAACCACTTT ATTGCAATTA 1800 

ACATACGGAC GTGGGTCGCG GCGAGGGGTG GGGGCGAAGA AGGCGCCATA CATCGAGGCG 1860 

TCATTTAGCG GAGCAGCCAC ACCAAAAGTG CCCCGAACCC TCCAGATAGG AGGGCCACGA 1920

CGAGACAGGC GATAACCAGC CCGACGCACC GCGTGCGCCG CCGTCGGCGC CTTAGGACCG 1980 

ACTGCTGGCG GCCCATGCGC ACGAGGAAGT CGTTGGCGGC CTCGTCTTCG CTTTCCGAGT 2040 

AGTAGGCTTC TGCCGGGACG GGCGAGGCCG CGGGGTAAAG CGGCACCGAC GCGCTGGAAC 2100 

GCACCGAGTC TTGGTCGGCG GGCCGGGAGG TCATCGCGGA CGCGGAAGGG CGCTGGCGGA 2160 

GGGCCGGAGG CGAAGGTGCG GTTGCCGTGA CTCACGATTT TTATGAGCTG CGGCGGGGCT 2220

GGCCGCCGGA CCTTTATGCG CCTCGGGCGA TTGACGTCAC GTAAAACGCA ATCCCGCACA 2280 

GGACGGCCCC GAGACCCACC GCCCCCCGCA GCCAGCGCAC GGCGAGCCAG GTGACGAATT 2340 

GGGAGGGGGC GTCCACGGCG TGGAGGGCCA CGGGAAAGGC CGCGGGGGAG CCGCCGCGAG 2400 

GTGGTCTGCG GCACGCGGGC GCGGCGCCGC CCGCGCCGGG GGGCAGGGTC TCTGGCGGGT 2460 

CCCCGCGTGC GTCCGCGATG GCAATCAGTT CATCGCCGAC GTCCGCGTCG TCGGAAGACG 2520

CCTTACCAGA GGACGGACGA ATAGGAGGCC TGGGAGTGAC GACGGCCCGG GCTTCCCGAA 2580 

CCAAAGGTGG TGAGCGGGCG GCGAGATTTA CGCCCCTCGC TATGGGGGTA TACAGACGGA 2640 

GCCGTTGGTG ATAAGATCTC AAAGCCGGAT CCATTTGTGG AGGGAGAGTC GGGTCTCTCC 2700
GGAGGGTCCT GCCACAGGGA CCCGTCGCGC TCCCCCTCGC TGTCCGAACT CCAGTCCGCG 2760

TACAGCTCGC TGTCCGCCAC GCGAATGTAA GTGGGGCCCG TCGCCGAGGC CCGGCTTTTA 2820 

ACCGCCCGCC AGGAGCGCCT GCGCCAGCAG GTCATGCACG CCCACGCGGA CAGCCCGAGG 2880 

GCGGCCAGCA ACAGGGCCGC CCCCAGCACC GCCCCGAGGC GCAGCGGGCC GCGCGCGGAG 2940 

GGCGCGGGAG GGGGGGCTCT CACGTGCGGG CGGGTGGGCT CGACGGGCTC GGGCTGGCGC 3000

TGGGGGAGGT GCTGTTCCAC CACCGCGTTC CGGTACTGCG CCGCGGTGCT GATGGTCATG 3060 

TGGCCCCAGG CGTGGATATG ATCGTCCACG TACACCACGC ACAGGTAGAG GCCGGCGTGC 3120 

TGGGGGGAGG CGTGCTGGAA TTCCAGATTG ACGGTGGAGG CCAGCCACGC CAACCCCGGG 3180 

ACCGGTTCCA TGCGAGCCTC GGCAAAACAT CGCGGCGGGG GCGTAGTCCT GGAACAGCCG 3240 

GCGTAGCTGC GGACCGCCAG GCGGTACGCC CAGGAACTTA CGGCGCACGG CGCGTCGGCC 3300

GGAGATAGAC ACTCTGGAAG CTGCGGGTGA TACAGACAAG CTTCGTAGAT CCGCATCTCG 3360 

GCGCACGAGG ACGGCACGTC AAACCGCATC CAGACGACGT CCATGGCGTA CGGACCGTCG 3420 

TCGTGGGCTA CGGCGTGGAT GGAGACCTTC GTCTCAAACG TCTCCCCGGG GGCAAACATA 3480 

ATGGCCTCCG GGGTCTCCAT ATGGACCGTT ACCCCGCGCA CGTGGGACAC CTCGGGGATA 3540 

ACACGATGGT GCCTCGGGGG GGCCACGGGG GGACCAACGG GGGGGGGTTG GGGGGGGAAC 3600

GCTGACCGGC GTGCGTTCGC TCACGCCCGC GTCGTCTTCT TCGTCGTAGT CGTCGGGGGT 3660 

CGGGGTCGGC ACAGGGGCGG GCTCCACGAC CAGAACCACC GACGCCACTT GGCGCGCCTC 3720

GTCGCTTAGG CCGACCACGG ACAGGGTGTA CAGACCGCTG TCCGTCTCCA GGGCCCCGTA 3780 

GATGACCAGA CTCTCGTTGA CCACGGCTAC GCGATCGCGC CACGCCAACT CCGAATACAG 3840 

TCCCTCGTCG CCCGCGGGGA ACGGGGGACT GTATGCTATG GCGAGCGGTT CCGGGGCGCG 3900

CATGCACGCC GCATCCACGA CCGTCTCGAG CACCCGTCGG GGGGGCCACA GCGCCACCCA 3960 

CGACGGGCGC AGGGGACCGC AGGCATCCAG GGGTTCCGCG GCCCACAGTA GTTTGTGGGC 4020 

CCGGGTGCGT TCCTCCGGCC CCGCGGGCGC CGGAAGCAAC ACCACGTCCT CGCCCGAGGT 4080 

TACCCGTTTC CAGGACGTTC TGGGTGCTGC CGCCAGGCAC GATACGACCC AAACTCCAAC 4140 

AAAAAACACC AACCCGGCCC CGCGAGCCAT GTTCGGGTGG CAGGAGCCGT CGGTCGGGGC 4200

AGATCGGAGA CTAGCTGACG GCGGCGCACC AAGTCACCCG AAGACACAGA GTCGGGGCGG 4260 

CGACTCCTTA AATGCGCGGC GGGCCTCTCC GACACTACCC CCTTTATTCT TTTTCCTCCC 4320 

CCCCCGGGCC CCGCCCATCC ATTACCCGCC TCCCATGCCA TCCGGGGAAT GACGAACGAT 4380

CACAAAGGGA TCCAACACAC GCATATAGGC AAATAACATC GGTTTATTGG GGGGGAAATA 4440 

ACCACGATGG GGGCGGTGGG GCGGGCCTGC CGAACGGCCC GCTTGGACCT AAACCTCTTG 4500

GGGGGCCGTC GGGCCACTGC GGGGCCGAGG ACTGACGGAC AGCAGCACGA CTGGACCTGG 4560 

CTCCGATTCC TCAGCTATCG ACGTTAGGGA AGGCATGGTC GTGGACGACG ACGAACGGCG 4620 

TCGGGGTTTG GGGGGGGTGT TTGGGTGGGA TCGCAGCTCG GCTCCGAGGC GGGCCATGGC 4680 

CGCCTCGTTG ACCGCGCAGG AAACGCCCCC GGGGTTGTAA ATCTGGCCGC GGGGGCGCCT 4740 

GTATCGGCGC TGGCATCTAT GGATGAAGCA GATACAGCTG CCCAGAAACA CAAAGGCGAT 4800

GATGGACGCC GGTATGGCGA TCTGGATTAC CTGGGCTACG GTTAGCCTGT GTCTCGATTC 4860 

GCTGGCCGAT CGCGTGGAAT TGGGCGGGGC TCTCTCGCCG CTCGCGGGCG CGGGCGTCCC 4920 

TGTGTCCCCG GGGGCGGGGG TCGGGTCTCG GGGGGAGGAC GGGGATGTCG TTGTCCGTGG 4980 

AGGGGTGGGC CGGGAGGCTC CGGGGGTGTA TACGCTCGAG GGTCCCAGGC GCGGGGCCGA 5040 

AAAGGGAAGC TGCGCCGGAT CGCAGGAGCC GTAGTCCGAG CCGTTATACA CAAACGTCCC 5100

GTTGGCAGAG AGCGCCACCC CCAAAACAAA CAGGCTGGCG TTCGTCGCGC TGCCGACCCA 5160 

TACGCGCAGG ACATACAGAC CGGCATAGTC GCGCGTTGCC GTTCGAACCC GCAGAAGCGG 5220 

CTGCCGCGCC AGACCCAGCT CCAGGGTCGG ATAGGCGGGG CTGTGGGCGT GGTGCGTCGA 5280
GCGACACAAG GTGAACGCCA CGGCGGGGCG GCGGGGGCAT GCGGTCAGTG TGACCACGTG 5340
TACAACGCGG GGGCAGTGGT TCCCCAGGGG GTAGTGAAAC AGCTCGATGA TGCCGTCGTA 5400
GTAGTTTGTG TGGGGGACCT GGGCCCCCAC AAAATGAAGC TCCCCGAAAA CACGCAGGTC 5460
CTCTTCCACG AAGCCCTGGG GCCCCACGGC CCCGGCATCC ACGAGTGAGT CTGAGACCAG 5520
ACTGACCGTG GGGCCGCGGA CGACCAGGCC GGTGGCGCAG ACCCACAGGC CCAGGATCGC 5580
CAGGCCCTGC AGCGAGCGGC CGGGCATACC GGGATCGGAC GGGTCGAGGT GACTGTGGGC 5640
GCGGTGGCTA ATCGTCGGGA CAGCGGTGCG CGCCCCACGC TCCCGGCCTA TGAACTGTCC 5700
TAGTTTCCCT CCTTCGAGAC TCCCTTTATG CGGAGTCCAA GTCCCACCCA AATACCCCAG 5760
CCACCCTCCC ACACGGGCCC AGAGGTACAC GGGAGCGGGG ATACTCCTCT AGTAAAACAA 5820
TGGCTGGTGC GAGGGGGGCG CGTCGTCATC CCGGATGTGG GGGAGACGTA GGCGCTTGGG 5880
GGCCATCTGA GCGCGGCGGC GTACCCAAAA CGCAATACCG CCGATGACCA GCACCGCCAG 5940
GGTACTGCCG GCCAGCGCGC CGATGATCAG GCCCGGGTTG CTGGGGGCGG CGGGGGCGTG 6000
GTGCGGCGCG ACGTCCTGGA TCGACGGGAT GTGCCAGTTT GGGGGGATCT GCGAAGACAC 6060
CGTCCCGGCG GGATCCTCTA AGAGGGCCGA GTCCTCGGGG TCTTCCGGAA CGAGTTCGGG 6120
TTGCGTGGCG TTGGTGGTGT CGGACAGCTC CGGCGGCAGC AGGGTGCTGG TGTACGGGGG 6180
CTTGGGGCCG TGCCACCCGG CGATTTTTAA GCTGTATAGG GCGACGGTGC GCTGGTTTTC 6240
GGGGATAAAG CGGGGGAGCA TCCCGATGCT GTCGACCGTC ACGCCCTGTT GGTAGGCCTT 6300
CGAGGTGAGG CACGCTGCCG GGGGGATGCG CAGGGGGAGA GCGTACTTGC AGGAGGCGCG 6360
GGCCCGGTGC TCCAAAATAA ATTGTGTGAT CTCCGTCCAG TCGTTTATCT TCACTAGCCG 6420
CAGGTACGTA CCCGCGGTCT CGAAGGCGGG GGCGTGCATC AGGAATCCCA GGTTATCCTC 6480
GCTGACGGCG CTAAAGCTGT CATAGTAGCT CCAGCGGGGC TGCGTTCGGA TGGGGCAGAC 6540
CCCCAACGAC TTGTTGTAGG GGCACTCGGT GTATTCCATA ACCGTGATGG GGATAGCGCA 6600
ATTGTCTCCC ATGCGATACC AGGCGATGGT CAGGTTGTAC GTGTGCTTTC GGGCCTCGTC 6660
CGAAGCCCCG CGCACGATCT GGGGGGCCTC CGATGGGGCA TGTAGGAGCA CGCTGCGGCA 6720
GGCACGTTCC AGCACTGCGT AGTACACAGT GATCGGGATG CTGGGGGGCT GGAACGGGTC 6780
CTCCAGGCTC GGCTGAATGT GGTAAACACG CTTCACCCCG GGGGGGTCGG TCAGCTGGTC 6840
CAAAACCGGA AGGTTCTTCC CGCGAAATCG ATTGGGATCG GCCATCTTAA GCGAGGGGTC 6900
TGCTAAGGCG TATTTGGCGC AGACGACGCG GAGTCCCACC GCGACAACTA GCAGGGCCGC 6960
CGTCCCGACG CCGGAGGTCA AACGCCCCAT GCCGTGATAC GCGATGCACA CGAAAAACGG 7020
CGACTAGTCG TTCGCAATGC AGCTTATGAC CGAACACCAC ACCGACCCCG GGTTTTAAAC 7080
ACAAAGACTC TATTATACTC CTCCTCCTCG TAAAAATGGA ACCTCCCCTC TGGGGGTGAT 7140
TTGGTTGCAT ATGTGGTCGA ATCGGAGTAT GGTGGTGCGG TGGGTCCGCC ATAACCCCCC 7200
TTGTGGGATG GGTTGTGGGC TTGATGTGTT TTTAGTTTCT TTCCCCCCCC CCCCAAGTTG 7260
GTACGCTTGG GGATCGCCGG ATTATTTCGT CTTTCCCGCA CAACCCATGC CCGGCACGGG 7320
GCGTGTGGCA ACAACCAAAG TTATATTACC GACCGCTCCA TAGCTGCTGT ACCCCGGGCA 7380
CCCGACACAC AATCGGGGCG ATGGGGTGGG GGCAAGGCCA GAAAGGCGAA AAATCATGGG 7440
GCAAATTGGC CCGCGTGGGG GCAGCACCTC GCCAACTCGC GACCCAGGCG ACGCAGGACC 7500
TCGAGTAGAC ACGCCATCCC CAGAATCATG AGACACAGCC CCCCCACGAT GAGGGGGACG 7560
GCAAAGCCCC CGGAACGGAT GAGTGGGGGG TGCGTGGGGA GGCGTGCGGT CGCGTTTGTT 7620
GTATCGGACG CGGGGCCGGT GGGTGCGGCC CCAACAGCAG CACACCCGAG GATTCCCACA 7680
ATCCCCCAGG TCCGAACGGC ATACCGATCC ATTGAGACCA AAACAACAGG CACGCCCCCC 7740
GGCGGCGGTC AAGGTTTTTG TTTTTGTTCG GGGACCCGGG TGACTTCGTC TGGGGGCCTT 7800
TCTCTGTGTG GCCAAAAGTT GCGCGTCTCG AGGGCCCCGG GACACGTCTT TTAAAAGACT 7860
CTGTCACCCA CTGATACCTC CCACCCCCCG CCCCCAGTCC CACAAAACAC AACCAAACTC 7920

ACATATCCGA TTGACGTCAC AGGTTTATTG TTCTTATCGT GGCATTTGGT CGCTGTTCCC 7980 

TTTCCCGTCC TTCATCGTTT CTCGCCCCCA CCCCACCCCC TAATCCCGCT CGGGTGGCAG 8040

ACATACGTAA CGCACGCTCG GGTGCGCGTA TCGCCTGCGC CCCGCCCGGC CGCGCCAAAG 8100 

TTGTGCTGCC AAGGCGACCA GACAAACGAA CGCCGCCGTG TGGATGGTGG TGCTGATGAT 8160

AAAGAGGATA TCTAGAGCAG GGGAGGCCGT TAGGAACCAG AACAGGGGGA TGTGTTGGGG 8220 

TGTGGGGCCC GAGGGCATGT CCTTAGCGGG AGCTTGGGCG GGGGGGCGAG GCGTGTTGGG 8280 

GGCGAGCGGC CCAAGAATTC CTGGCGGGAG CGTGGGGCGG ATGGGCCCGG GGCGCGCGGG 8340 

GGGTGGTTTG TTGGGGTTCG GAGTTCGGAA GGCGAGGCCG GTGGCGCTGT TGTTGTCATC 8400

GGGGGGTTCG CCGTCCCCGG CGCCCTCAAA CTCCTCGGGT CCGCCGCGAT GTTCGGGGGG 8460

TGGGGGGGCT GGCGAGCCGG GGGGAGCGTC CGCGGGTCCG TGTGGGTGCG TCTTTGGGTC 8520 

CGTTGGGGGG GTACGGGCGG TGCCGCGGGT TCCGGGCGTG GCGGTGGTCG CGGCAACCGA 8580 

AACGTTGGCG GCCGAGGGCC CCGGCGCGGT ACCGGGGGGC GAAGCGGTGA GGGGGGAATC 8640 

GGCCGTGGGT GCGGCGGAAG CGCCCACCGG ACCCGGGGTT GCGGGTCCGG GAGGGGTTGT 8700 

TTGGGGCCCC GGATTCCTGG GGCGGGGGGT CACGTGGGTA AACGTGGGCG GGGGGGTCGT 8760

GGGGGCTGGT GTGGTGGGGG GCGTTTGCGC TGCGGGGGCG CTGCTGGTGT TCGTGTGCCC 8820 

GGCCCCGGGC GTTGCCGCCG CGGCGGGGAT TGGCTACTAC TCCACGGATG CATTCCCGGG 8880 

CGGGGATGCA ACTGCCGTTT CCTCCGGCGT AACGGCGACC GTTGCGGCTT GTGTGGCCCT 8940 

CTCGTCGGGG GGAGTATTGG TTGCGGGGGC GGTCGGTCCC CCCCTTGGGT TGACTGATGG 9000 

CCCCATGGCG GTGGGGTAAA GAGGGAGGGG GGTTTTTTGG AGAGGGGAAG TTGGGGAAGG 9060

GGAGGAAGGT TTNTGGGGGA GGGGTAAGAG GGGGGGNNNG GGNGAGGGGG GGNNAAAGGG 9120 

TGGGGGGGAG GGNGGGGGGG GCTGTNCCCA CGCTCCCCCC CCCCCCCCGC CCCGGTTCGC 9180 

AAGCGGGCTT TCGTACCTAC ACCCAGGGCC CCCGGCT 9218 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 296 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Ile Arg Arg Arg Gly Asn Val Glu Ile Arg Val Tyr Tyr Glu Ser
1 5 10 15

Val Arg Pro Ser Arg Ser Arg Ser His Leu Lys Pro Ser Asp His Gln
20 25 30

Glu Phe Pro Gly His His Val Ser Pro Gly Ser Pro Gly Phe Pro Glu
35 40 45

Ser Pro Gly Asn Arg Glu Phe His Asp Leu Pro Glu Asn Pro Gly Ser
50 55 60

Arg Ala Tyr Pro Gly Thr Arg Asp Pro His Asp Pro His Gly Cys Pro
65 70 75 80

Gly Ser Leu Asp Pro His Gly Asn Pro Ala Gln Pro Ala Gly Leu Pro
85 90 95

Ser Pro Val Pro Tyr Ala Pro Leu Gly Ser Pro Asp Pro Ser Ser Pro
100 105 110

Arg Gln Arg Thr Tyr Val Leu Pro Arg Val Gly Ile Arg Asn Ala Pro
115 120 125

Ala Ser Asp Thr Arg Ala Pro Lys Arg Ala His Ser Arg His Arg Ala
130 135 140

Asp Arg Pro Pro Glu Ser Pro Gly Ser Glu Leu Tyr Pro Leu Asn Ala
145 150 155 160

Gln Ala His Leu Gln Met Leu Pro Ala Asp His Arg Ala Phe Phe Arg
165 170 175

Thr Val Ile Glu Val Ser Arg Leu Cys Ala Leu Asn Thr His Asp Pro
180 185 190

Pro Pro Pro Leu Ala Gly Ala Arg Val Gly Gln Glu Ala Gln Leu Val
195 200 205

His Thr Gln Trp Leu Arg Ala Asn Arg Glu Ser Ser Pro Leu Trp Pro
210 215 220

Trp Arg Thr Ala Ala Met Asn Phe Ile Ala Ala Ala Ala Pro Cys Val
225 230 235 240

Gln Thr His Met His Asp Leu Leu Met Ala Cys Ala Phe Trp Cys Cys
245 250 255

Leu Ala His Ala Ser Thr Cys Ser Tyr Ala Gly Ser Ala His Cys Gln
260 265 270

His Leu Phe Arg Ala Phe Gly Cys Gly Pro Pro Val Leu Thr Thr Ser
275 280 285

Arg Gly Gln Gly Gly Trp Cys Asn
290 295



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 85 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Thr Ser Arg Pro Ala Asp Gln Asp Ser Val Arg Ser Ser Ala Ser
1 5 10 15

Val Pro Leu Tyr Pro Ala Asp Val Pro Ala Glu Ala Tyr Tyr Ser Glu
20 25 30

Ser Glu Asp Glu Ala Ala Asn Asp Phe Leu Val Arg Met Gly Arg Gln
35 40 45

Gln Ser Val Leu Arg Arg Arg Arg Arg Arg Thr Arg Cys Val Gly Leu
50 55 60

Val Ile Ala Cys Leu Val Val Leu Ser Gly Gly Phe Gly Ala Leu Leu
65 70 75 80

Val Trp Leu Leu Arg
85

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Val His Ala Val Asp Ala Pro Ser Gln Phe Val Thr Trp Leu Ala Val
1 5 10 15

Arg Trp Leu Arg Gly Ala Val Gly Leu Gly Ala Val Leu Cys Gly Ile
20 25 30

Ala Phe Tyr Val Thr Ser Ile Arg Ala
35 40



(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 337 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Val Ala Pro Pro Arg His His Arg Val Ile Pro Glu Val Ser His Val
1 5 10 15

Arg Gly Val Thr Val His Met Pro Glu Ala Ile Met Phe Ala Pro Gly
20 25 30

Glu Thr Phe Glu Thr Lys Val Ser Ile His Ala Val Ala His Asp Asp
35 40 45

Gly Pro Tyr Ala Met Asp Val Val Trp Met Arg Phe Asp Val Pro Ser
50 55 60

Ser Cys Ala Glu Met Arg Ile Tyr Glu Ala Cys Leu Tyr His Pro Gln
65 70 75 80

Leu Pro Glu Cys Leu Ser Pro Ala Asp Ala Pro Cys Ala Val Ser Ser
85 90 95

Trp Ala Tyr Arg Leu Ala Val Arg Ser Tyr Ala Gly Cys Ser Arg Thr
100 105 110

Thr Pro Pro Pro Arg Cys Phe Ala Glu Ala Arg Met Glu Pro Val Pro
115 120 125

Gly Leu Ala Trp Leu Ala Ser Thr Val Asn Leu Glu Phe Gln His Asp
130 135 140

Gln His Ala Gly Leu Cys Val Val Tyr Val Asp Asp His Ile His Ala
145 150 155 160

Trp Gly His Met Thr Ile Ser Thr Ala Ala Gln Tyr Arg Asn Ala Val
165 170 175

Val Glu Gln His Leu Pro Gln Arg Gln Pro Glu Pro Val Glu Pro Trp
180 185 190

His Val Arg Ala Pro Pro Pro Ala Pro Ser Arg Pro Leu Arg Leu Gly
195 200 205

Ala Val Leu Gly Ala Ala Leu Leu Leu Ala Ala Leu Gly Leu Ser Ala
210 215 220

Trp Ala Cys Met Thr Cys Trp Arg Arg Arg Ser Trp Arg Ala Val Lys
225 230 235 240

Ser Arg Ala Ser Ala Thr Gly Pro Thr Tyr Ile Arg Val Ala Asp Ser
245 250 255

Glu Leu Tyr Ala Asp Trp Ser Ser Asp Ser Glu Gly Glu Arg Asp Gly
260 265 270

Ser Leu Trp Gln Asp Pro Pro Glu Arg Pro Asp Ser Pro Ser Thr Asn
275 280 285

Gly Ser Gly Phe Glu Ile Leu Ser Pro Thr Ala Pro Ser Val Tyr Pro
290 295 300

His Ser Glu Gly Arg Lys Ser Arg Arg Pro Leu Thr Thr Phe Gly Ser
305 310 315 320

Gly Ser Pro Gly Arg Arg His Ser Gln Ala Ser Tyr Ser Ser Val Leu
325 330 335

Trp



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 226 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Arg Ala Gly Leu Val Phe Phe Val Gly Val Trp Val Val Ser Cys
1 5 10 15

Leu Ala Ala Ala Pro Arg Thr Ser Trp Lys Arg Val Thr Ser Gly Glu
20 25 30

Asp Val Val Leu Leu Pro Ala Pro Ala Gly Pro Glu Glu Arg Thr Arg
35 40 45

Ala His Lys Leu Leu Trp Ala Ala Glu Pro Leu Asp Ala Cys Gly Pro
50 55 60

Leu Arg Pro Ser Trp Val Trp Pro Pro Arg Arg Val Leu Glu Thr Val
65 70 75 80

Val Asp Ala Ala Cys Met Arg Ala Pro Glu Pro Leu Ala Ile Ala Tyr
85 90 95

Ser Pro Pro Phe Pro Ala Gly Asp Glu Gly Ser Glu Leu Ala Trp Arg
100 105 110

Asp Arg Val Ala Val Val Asn Glu Ser Leu Val Ile Tyr Gly Ala Leu
115 120 125

Glu Thr Asp Ser Gly Thr Leu Ser Val Val Gly Leu Ser Asp Glu Ala
130 135 140

Arg Gln Val Ala Ser Val Val Leu Val Val Glu Pro Ala Pro Val Pro
145 150 155 160

Thr Pro Thr Pro Asp Asp Tyr Asp Glu Glu Asp Asp Ala Gly Val Ser
165 170 175

Thr Pro Val Ser Val Pro Pro Pro Thr Pro Pro Arg Trp Ser Pro Arg
180 185 190

Gly Pro Pro Glu Ala Pro Ser Cys Tyr Pro Arg Gly Val Pro Arg Arg
195 200 205

Asn Gly Pro Tyr Gly Asp Pro Gly Gly His Tyr Val Cys Pro Arg Gly
210 215 220

Asp Val
225

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 429 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Val Tyr Leu Trp Ala Arg Val Gly Gly Trp Leu Gly Tyr Leu Gly Gly
1 5 10 15

Thr Trp Thr Pro His Lys Gly Ser Leu Glu Gly Gly Lys Leu Gly Gln
20 25 30

Phe Ile Gly Arg Glu Arg Gly Ala Arg Thr Ala Val Pro Thr Ile Ser
35 40 45

His Arg Ala His Ser His Leu Asp Pro Ser Asp Pro Gly Met Pro Gly
50 55 60

Arg Ser Leu Gln Gly Leu Ala Ile Leu Gly Leu Trp Val Cys Ala Thr
65 70 75 80

Gly Leu Val Val Arg Gly Pro Thr Val Ser Leu Val Ser Asp Ser Leu
85 90 95

Val Asp Ala Gly Ala Val Gly Pro Gln Gly Phe Val Glu Glu Asp Leu
100 105 110

Arg Val Phe Gly Glu Leu His Phe Val Gly Ala Gln Val Pro His Thr
115 120 125

Asn Tyr Tyr Asp Gly Ile Ile Glu Leu Phe His Tyr Pro Leu Gly Asn
130 135 140

His Cys Pro Arg Val Val His Val Val Thr Leu Thr Ala Cys Pro Arg
145 150 155 160

Arg Pro Ala Val Ala Phe Thr Leu Cys Arg Ser Thr His His Ala His
165 170 175

Ser Pro Ala Tyr Pro Thr Leu Glu Leu Gly Leu Ala Arg Gln Pro Leu
180 185 190

Leu Arg Val Arg Thr Ala Thr Arg Asp Tyr Ala Gly Val Leu Arg Val
195 200 205

Trp Val Gly Ser Ala Thr Asn Ala Ser Leu Phe Val Leu Gly Val Ser
210 215 220

Ala Asn Gly Thr Phe Val Tyr Asn Gly Ser Asp Tyr Gly Ser Cys Asp
225 230 235 240

Pro Ala Gln Leu Pro Phe Ser Ala Pro Arg Leu Gly Pro Ser Ser Val
245 250 255

Tyr Thr Pro Gly Ala Ser Arg Pro Thr Pro Pro Arg Thr Thr Thr Ser
260 265 270

Pro Ser Ser Pro Arg Asp Pro Thr Pro Ala Pro Gly Asp Thr Gly Thr
275 280 285

Pro Ala Pro Ala Ser Gly Glu Arg Ala Pro Pro Asn Ser Thr Arg Ser
290 295 300

Ala Ser Glu Ser Arg His Arg Leu Thr Val Ala Gln Val Ile Gln Ile
305 310 315 320

Ala Ile Pro Ala Ser Ile Ile Ala Phe Val Phe Leu Gly Ser Cys Ile
325 330 335

Cys Phe Ile His Arg Cys Gln Arg Arg Tyr Arg Arg Pro Arg Gly Gln
340 345 350

Ile Tyr Asn Pro Gly Gly Val Ser Cys Ala Val Asn Glu Ala Ala Met
355 360 365

Ala Arg Leu Gly Ala Glu Leu Arg Ser His Pro Asn Thr Pro Pro Lys
370 375 380

Pro Arg Arg Arg Ser Ser Ser Ser Thr Thr Met Pro Ser Leu Thr Ser
385 390 395 400

Ile Ala Glu Glu Ser Glu Pro Gly Pro Val Val Leu Leu Ser Val Ser
405 410 415

Pro Arg Pro Arg Ser Gly Pro Thr Ala Pro Gln Glu Val
420 425



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Val Cys Ile Ala Tyr His Gly Met Gly Arg Leu Thr Ser Gly Val Gly
1 5 10 15

Thr Ala Ala Leu Leu Val Val Ala Val Gly Leu Arg Val Val Cys Ala
20 25 30

Lys Tyr Ala Asp Pro Ser Leu Lys Met Ala Asp Pro Asn Arg Phe Arg
35 40 45

Gly Lys Asn Leu Pro Val Leu Asp Gln Leu Thr Asp Pro Pro Gly Val
50 55 60

Lys Arg Val Tyr His Ile Gln Pro Ser Leu Glu Asp Pro Phe Gln Pro
65 70 75 80

Pro Ser Ile Pro Ile Thr Val Tyr Tyr Ala Val Leu Glu Arg Ala Cys
85 90 95

Arg Ser Val Leu Leu His Ala Pro Ser Glu Ala Pro Gln Ile Val Arg
100 105 110

Gly Ala Ser Asp Glu Ala Arg Lys His Thr Tyr Asn Leu Thr Ile Ala
115 120 125

Trp Tyr Arg Met Gly Asp Asn Cys Ala Ile Pro Ile Thr Val Met Glu
130 135 140

Tyr Thr Glu Cys Pro Tyr Asn Lys Ser Leu Gly Val Cys Pro Ile Arg
145 150 155 160

Thr Gln Pro Arg Trp Ser Tyr Tyr Asp Ser Phe Ser Ala Val Ser Glu
165 170 175

Asp Asn Leu Gly Phe Leu Met His Ala Pro Ala Phe Glu Thr Ala Gly
180 185 190

Thr Tyr Leu Arg Leu Val Lys Ile Asn Asp Trp Thr Glu Ile Thr Gln
195 200 205

Phe Ile His Arg Ala Arg Ala Ser Cys Lys Tyr Ala Leu Pro Leu Arg
210 215 220

Ile Pro Pro Ala Ala Cys Leu Thr Ser Lys Ala Tyr Gln Gln Gly Val
225 230 235 240

Thr Val Asp Ser Ile Gly Met Leu Pro Arg Phe Ile Pro Glu Asn Gln
245 250 255

Arg Thr Val Ala Lys Leu Lys Ile Ala Gly Trp His Gly Pro Lys Pro
260 265 270

Pro Tyr Thr Ser Thr Leu Leu Pro Pro Glu Leu Ser Asp Thr Thr Asn
275 280 285

Ala Thr Gln Pro Glu Leu Val Pro Glu Asp Pro Glu Asp Ser Ala Leu
290 295 300

Leu Glu Asp Pro Ala Gly Thr Val Ser Ser Gln Ile Pro Pro Asn Trp
305 310 315 320

His Ile Pro Ser Ile Gln Asp Val Ala Pro His His Ala Pro Ala Ala
325 330 335

Pro Ser Asn Pro Gly Leu Ile Ile Gly Ala Gly Ser Thr Leu Ala Val
340 345 350

Leu Val Ile Gly Gly Ile Ala Phe Trp Val Arg Arg Arg Ala Gln Met
355 360 365

Ala Pro Lys Arg Leu Arg Leu Pro His Ile Arg Asp Asp Asp Ala Pro
370 375 380

Pro Ser His Gln Pro Leu Phe Tyr
385 390

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Val Gly Gly Leu Cys Leu Met Ile Leu Gly Met Ala Cys Leu Leu Glu
1 5 10 15

Val Leu Arg Arg Leu Gly Arg Glu Leu Ala Arg Cys Cys Pro His Ala
20 25 30

Gly Gln Phe Ala Pro
35

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12489 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



GAAAAGGGGG AAGGTGAGGG ATAGGGAAGG AAGAGGAAGG ATAAGAAGTG AAGAAAAAGG 60
GAAGAGAAAA GATAGATATG GGGAGAGGAG GAAGAGAGGG GGTGAAGAAG GGAGAAGAGG 120
GAGAGGAGAG GTAAAGAGGG GAGAGGAGGT AGGAGTGGAA GGGAAGAAGA GAGGAAAAGG 180
GGGGGAGGGA AGAGGGGAGG AGCGGCCGAA GCCGGAATGA CAAACAGACG AAGCGACTGG 240
GGGAGATCCC CCCGCCCCCG AGGACAGCTT TTCCGGGACC TATCCCCGCC ACCGCCGTAT 300
AAGCTCGTCT CCACGGTCGA TATCCCCCAC CCCGAGACAC CCCGGAGAAC ACCGAGCGGC 360
CGACAGGCCA CGGACCCCTA TTGCCGTCGA CACACCACCA GCAATCTCCG CGGATGTGCA 420
GCGACGNGAC CACACCGCCC GAAAACATCT GANTTCCCCT ATGACCTTTC CCACCACCCT 480
CGCGCGGCGC AGGCGGCCAG CGCGAGCGCG CCCGCAAAGG TCACCACGGG AACCCAGATG 540
TTTTCGGGCC GTGCGGCCCC GTCGCCGCAG ACCCGCGGAG ATCGCTGGTG GTGCGTCGAC 600
GGTAACAGGG GTCCGTGGTC TGTTGGCCGC TCGGCGTCCT CCGGGGCGTA TCGGGGCGGG 660
GGAGACTCGA CCGCGGAGAC GACGCAAAAG CGGCGGTGGC GGGGACAGGC CTTGGAAAAG 720
CTGTCCCCGG CGGCGGTTGT CCCGAGGCCC GCCACCACCG CCGCCGTCAG GGCCGCGGCG 780
GCGCGGGGGT CAATCCACCA GTAGCGCGCA CTGCCGCCTA TCGTCAGCCC GCCGACGACA 840
ACCCCCAGGG TGCCGACAAA CAGGGGCCGC GGGGCGAGGA GGAAAAGGGG ATGGACGTCG 900
TCGGCCAGCA CCAGCAGCGC CGCGTAGGCG GCCCCGAGCG CCAGGCACTG GGTCGCCGGG 960
CCCGGAACCC CCGGAGGCGC GCCGGCCTCC CCGAGGCTCC ACAGGGCCAC GGCCGCTCCC 1020
CCGCCGACCA GCCTCAGCCA CGCGCACACA CGCGTCGGCG GCCGGGCGAA CGCGGGGGGG 1080
GCGCGGAGGA ACCCCAGGCC GGTGCTCCCG ATCACGAACG CCCCCGACAT GGCGTCCGCA 1140
TACGGGCCGC GCGACAGAAC GACAGACCCC ACGAAGCCCA GCGTGGTGGC CTGGAGCCAG 1200
AGATGCGCGA ACCCCCGAGC GATGGCACCC ACGCAGGCGA GGCGGGGGCG CCCGTCGTCC 1260
CCATCATCCC CTCCGAGGCA GTGTTCGCCG CTCCCGCCCG CGACCCCGGG GCTGTCCCCG 1320
CGACACATCC CGCACCCGGG GCCCGACGGC GCTCTCGGTG AAGCTGCGAA GGGCCCCGGG 1380
CCCGCTCTTA TAAGCGCACG CAAAACAAAA AAGGAGGGGG AAGGGGGGTG GAAAGGACGG 1440
AGGCGCAACC CGGAACCACA CACAACAGCC ATATTGGTTG GAGGGGGGGG ACACACTACC 1500
CCATTTATTT ATTTTTTTAA CACAACGCAC CCCGCGTGCC CGGGCGCGGT GAACCGTTCG 1560
GCCACCACCC GTCGCCGTCA GGGCAACCCA AAACCGTATG GGGGTCTTTG GGGACCCCGG 1620
AAGGCGGAGA GGGGGCTGGG GCTCGCGTCG CCGGTCTGGA GGTCGCGAAA GTACCACGCG 1680
TAGCGGCCGC CGTCCCCACC ACCCTCCGCC GCCTCGGGTC CGTATCTCGC GGAGAGGGGG 1740
GGCGACTCGG CTCGCGTGGG GGCGGCGGGC ACGCCCGTCT TCGGGCGCTT GGTGGCGTCA 1800
TCCGCGGGGA CAACTTCGGC CCCGGCGTGC CTCCGCTTGG TTCCCGGCGG TTCCGGGAAC 1860
GCGGGCGGGA CCCGGGCCTG GCCCTCGAGG CCTTCTTCTT CTGGGGCGCC GCGGTCGCCC 1920
GCGTCCGGCT CGGAGGAGAA GTCCTGGCTG TCGGTCGGGC CGGCCCGAGA GTCCTGGGAC 1980
GCCAAGCTCC CGGTCGAGGA ACCCGGGGTC CGGCCGAGCC AGTTCAGGCA GACCCGCTGG 2040
GGTTTTAAAA GAAACACCGC CGACACCGCG TTGGGGCCGG TGGCGGTGAC ACACACGCTG 2100
GGGACGTCGG CCGTGAGGAA GAAATTGAGG GTCCCCCCGC CGACCTGGAG CCGCCGGAGG 2160
ACCGCCCGCA TGCTGCAGTC GTCGACGACC ACCGAGAATG TGCGGTGTGT GTTTTCCCCG 2220
TAGACCGTCT TGGCGTTGGC GGCCGCCTGG CCCGCCTTCT TCAGCGCGCT GGTCAGAATC 2280
TGGACCTGGG CGCTGGTGCT GGACGACGCG CCCTCCTCGC GGGCGGCAAA GGTGACGCAG 2340
GTGCGCGCGT TAAACACGGA AAACTTGCCG TTGGGGCCGA GCTCGAACGT GGTGGGTTTG 2400
GCGGTCTCGT CCCCGACGGC GTTCACCACC TTCGTGAGCT GGGGCTTCGT GAGGCGCAGC 2460
TGGACGTCGG GGTCGCCCTG GGGGAGTAGT ACCGCGAAGC TCGTCAACTC GCGTTTCATG 2520
AGCGTCTCGC TGGCAAGCTC CACGGCCTCT CCGTCGGACG CGGTCGTCCA TATGCGCTGC 2580
ACCAGCGTGC GAAACGGGGC CTGGCCCGTG ACCGTCAGCT CCACCCGCCG CAGGTCAGGG 2640
TACTGGTTGG CGCGAAACAC GCTCAGCAGG GATCGCTTCT GGTCCACGAG AGACAGGAAC 2700
GCCGCGGTGG GTCCGCCCCA TCGATAGCGA CTGAACTGCG AATGGTCGAG GGGCAGAAAC 2760
ACCTGCTCGC CGAAAATCGC GTTATGTACA AGGATGCCTC GGTCGCCCAC GACCAGGAGC 2820
GAGTCCAAAA GGCTCGTGCG AAGCGGCGCA AAGGCCTGCA GGATCCCGTT CAGCTCGGCG 2880
CCCTGCAGCA CTATCTGGCA GGGCGGCCAG TCTTCCGTCC GCTCGCGCGG CGACGGGATC 2940
GCGTCCTCCG AAAGGGGGGC GGCGGCCGCA CCGCCGGGAA GATGAGCCAT GCCGCGACGC 3000
TCCCGGGACA ATCCGCAGAG ACGAGGCGCG TCGTGTCACC GGGCCCGGAG GCGCGGCCGT 3060



TTGTGTCGCA GGCGGAGGGG GCGGATGACG CGGACCGGAT GGGGGTTAGG GGGGCCGGGG 3120 

GACCCGAGCC ACAGAGCAGT GGCTACCCGA GCCAAGGACT ACGGCGGACC CGCCGCCCTA 3180 

GTTTGGTTAA ATACGCCTTC CGCTAGTTAG GCCACACCCT CTTTGAGGGC TCGGGGGAGG 3240 

GGGAGGGGGG GAAGAGAGAG ATGGTCGGCC TGCACCGGCG CGCGCCGGCG GTTGCACCAA 3300 

TCCGCACGTA GATGGGAAAT AAAAAAGAAT TATAAAGAGC GTGCCTTTCC CGGGATAGCG 3360

TCTTGTTGGA GCGGGGTCGT CGCCGCAGCC ACTGTACACA GGGGCGGCGG GCTTGGGTGT 3420 

CCCGGACCGT CACACCTATA CAGCTCTGTA GAGAGACCTA TCCGCACCTA CAATCGTGCC 3480 

GGAATGGGTC TGTTTGGCAT GATGAAGTTT GCCCAGACTC ACCATCTGGT GAAGCGCCGG 3540 

GGCCTCCGGG CCCCGGAGGG GTACTTTACC CCCATCGCCG TGGACCTGTG GAATGTCATG 3600 

TATACCCTGG TGGTTAAATA TCAGCGCCGC TACCCAAGTT ACGACCGCGA GGCAATCACG 3660

CTACACTGTC TCTGTAGTAT GTTACGGGTG TTTACCCAAA AGTCCCTGTT CCCCATCTTC 3720 

GTGACCGATC GCGGGGTCGA GTGTACCGAG CCGGTTGTGT TCGGGGCCAA GGCGATCCTG 3780 

GCCCGCACGA CGGCCCAGTG CCGCACGGAC GAGGAGGCCA GTGACGTAGA CGCCTCGCCG 3840 

CCGCCTTTCC CCCATCACCG ACTCCAGGCC CAGTTTCCCC CTTTCCAACA TGCGCCGCCG 3900 

CGGGCACGCC TTCGCCCCGG GGGACCGGGG GAACGCGGGC CGCCGGCCCA GGCCCGGCGG 3960

CCCCCTGGGG CGCGCCCTCG AAGCCGGCCC TGCGCCTGGC TCACCTGTTC TGTATCCGCG 4020 

TTCTGCGGGC GCTGGGGTAC GCCTACATCA ACTCGGGTCA GCTGGAGGCC GACGACGCCT 4080 

GCGCGAACCT CTATCATACC AACACGGTCG CGTACGTGCA TACCACGGAT ACCGATCTCC 4140 

TGCTGATGGG CTGCGATATC GTGTTGGACA TCAGCACCGG CTACATTCCG ACGATTCACT 4200 

GCCGCGACCT GCTGCAGTAC TTCAAGATGA GTTACCCGCA GTTCCTGGCG CTGTTCGTCC 4260

GCTGCCACAC AGACCTGCAC CCCAATAACA CCTACGCGTC CGTCGAGGAC GTGCTGCGCG 4320 

AGTGTCACTG GACCGCCCCG AGCCGATCCC AGGCCCGCCG GGGGGCCCGG CGGGAGCGCG 4380 

CCAACTCGCG CTCCCTGGAG AGCATGCCTA CGCTGACCGC GGCCCCGGTC GGCCTCGAGA 4440 

CGCGCATCTC GTGGACCGAA ATTCTGGCCC AACAGATCGC GGGCGAGGAC GACTACGAAG 4500 

AAGACCCCCC CCTCCAGCCC CCGGACGTCG CCGGTGGGCC GCGCGACGGC GCCCGGTCGT 4560

CCTCCTCGGA GATACTCACC CCGCCCGAGC TCGTGCAGGT CCCCAACGCG CAGCGGGTCG 4620 

CGGAACACCG CGGCTATGTC GCCGGACGTC GCCGCCACGT CATCCACGAC GCCCCGGAGG 4680 

CCCTGGACTG GCTGCCCGAT CCGATGACCA TCGCCGAGCT GGTGGAGCAC AGATACGTCA 4740 

AGTACGTCAT ATCGCTTATC AGCCCCAAGG AGCGGGGACC CTGGACTCTT CTAAAAAGAC 4800 

TGCCCATCTA TCAGGACCTC CGCGACGAAG ATTTAGCGCG CTCCATCGTG ACTCGGCATA 4860

TCACCGCCCC GGACATCGCC GACCGGTTTC TGGCGCAGCT GTGGGCCCAC GCGCCCCCGC 4920 

CCGCGTTTTA CAAGGACGTC CTGGCTAAAT TCTGGGACGA GTAGCCGGAA CGGAGGAAAC 4980 

GCGCGCCCCC ATCCCCTCCC GATGCCCGAC CTGTTAATAA TAAGAGTAAT AAAATCGTTT 5040 

GTTATTATGC ATCTCGGGGT TCTGGTCGGC GCTTGATTTA TCGGTTGGAC GCGTTTCCCT 5100 

TTTGGTCCTT TTCTCTGGTT TCGGGCGTTC CTTCCCTTTC CCCAGCCGCC ACCCCCCTCC 5160

CCTGCGTAAT AATCACACCG GAGACCCAAC AGTCCGTTTC GACCCCTTTA TTTCGGTTAG 5220 

ACATCGCTAC AAGGGCGCCC AGACCCTCAC AGATCGTTGA CGACGGCCCC GGCGTACGAG 5280 

GTGCTGCGGC ACTCGAAGAA GTTGGTGTGT TTGTCGGTGG ACATGAGGCT GAGGGGAAAG 5340 

CTGGCGTCGG GGGCGGGGGC GGAATACAGG GGCTGCATAT GGATCAGGCC CAGCAGGCGA 5400 

40 TCCGCGCTGA ATCGCACGTA GTTCTCGATG GCCGCCAGGG CCCCCGGACT CAGGATAGAG 5460 

CTGTCCGTCG GGGCCTGGGA TCGGATGAAC CCGATCTCGA TATCCACCGC CTCCCGAAAC 5520 

AGCCGGTACA CGCGCGCCGC CTCGGGCTTG GCGTGGCCCC CGAGGTAGTT GTTGTAGATG 5580 

TAGCACGAGG CTGTCGTATG CACGGCCTCG TCGCGGCTGA TGAGGTCGTT CGACTGGCAG 5640
GTGACCCGCA GGAGGTTGTT GGTGCGCAGG TACGCGATGG CGGCGAACGA GGCGGCAAAA 5700
AAGACGCCCT CGATGAGGAT CATGAGGATG AACTTCTCCG GGATCGAGTC GCATTCCCGC 5760
ACCCGCGCCT CCAGCCAGTC CACCTTGACG CGAATGGCCG GGTGGTTGAT GGTGCGGGCC 5820
ACATAGGCGC GGCGCGCCTG GTCGTTGTTG TGAAAGAGCA CCAGCTGGAT GATGTTGTAG 5880
ACGCGCGAGT GGACGACCTC GATGCATTCC TGCTCCACGT AGTAGTGAAG AATGTCCTTC 5940
TGTTCGAAGA GGCCGGAGAG GCCGCCCAGG TTTTCCGTCA CCAGGTCGTC CGCGGCCGAC 6000
AGGAAGGCAA ACAGAAAGCG GTAGAAGCCG AGCTCGCCCT CGGAGAGCTT GGAGACGTCC 6060
TCCTCGTCCC CCACGAACAC GAGCTCGGTC TCCAGCCAGC GGTTCAGGAT GCTGAGGGAG 6120
CGAAGGTGGT TGATGTCGGG GCACTGGGAG GTGTAGAAGT ACCGCTCGGG GGTGGGGCAC 6180
ACCGGAATCG GGGCCGCCCC GGCCCCCGAC GCGTGGGTAT CTAGGGGGTC GGTGCTCGCG 6240
GGGGAGACGG CGGGATCCAT GGCGATATGC GGGACCGAGA GCGACGCCTG ACCCCGATCG 6300
GAGCGCTGTT GCTTACAGCG CGCAGCTTGT GCAGACGATG TTGTCGTCGC CGGCGAACAC 6360
CCCGCTGTTG GTCGCCTTGC GAACCTTGCA GTAGTACATC CCCGTCTTCA GGCCGCGCTT 6420
ATATGCGTGG ACGAGAAGGC GGACCAGGGT GGAGGCGGGG AGCGTCCCGT CCGCCTTCTC 6480
TGTGACATAC AGAGTCATGG ATTGGCTGTG ATCAACATAG GGGGCGCGGT CTGCACACAG 6540
GTCGATCAGC AGTTCCTGGT CGTAGTCGAA GGCCGTCTTG AACCGCCGGA GGGGGTGGGC 6600
GGGGTCCAGG CAAGGCAGGG CCTGGGCCAC AGACCACTGC TTGGCCTCGA GCCCGTCCAT 6660
CGCGTCCAGG AGCCGCTTCC CGCCGAACGT GCGCTCGAGT TCCTTCAGCA AGAGCGTGTT 6720
GGGGCGCAGC GTCTCGCCGT CCCTGGTCAC CTTGCTGAAC AGGTTGGTGA ACAGGGGGGC 6780
AAAGCCCTCG CTGACGTCCG AGATCTGGGC CGAGGCGGCG GTGGGCATGA GCGCGATGAA 6840
CTGGCTGTTG CGCAGGCCGT GTTTCATCAT GCTCTGGCGT AGCATCTCCC ACTCGCCCTC 6900
GTACCGCGGG CTGGCGTTCG AAAAGCGCTC CCAGTGAAAG CGGCCGGCCC GGTACATGCT 6960
GCGCTTAAAG TGGCTGAAGG GACGCGCCCC GCGAACGCAC AGCGCGTTAC TGGTCTTCAT 7020
GGCCGCGAGC AGCATCACCT CGGCGATGTG TGTGTTCAGG TCCCGGAACT CGGCCGACTC 7080
CAGATCCAGG CCCATCTTCA GGCACGCCGT GTGCAGGCCC TGCATGCCAA TGCCCATGGA 7140
CCGCAGGTTG TCGTGGCCGC GGGCGCACTG GGGCGTCGGC TGCAGCGTGC TGTCTATCAT 7200
GATATTAACC ATTAGCACGC ACGCCTGCAC GGCGTCGCGG AGCATGCCAA AATCGAACGT 7260
CCGCCGGGAG ACGCATCGGG CCAGATTCAC GCTGCCCAGG TTGCAGACCC CGCTGGAGCG 7320
TTTGGAGGAC GGGTGGACGA TTTCCGTGCA GAGGTTGGAG CCGGCAATGG CCGCCCCTTG 7380
CGTGTTGTAG ATGTAGTGGC GGTTTACCGC GTCCTTAAAC ATGATGAAGG GGCTTCCGGT 7440
GGTGGCCGCG CTGCGCACGA TGGCGTACGC CAGGTCCTGG ATGGGGATCG TTTCGCCGAA 7500
CCCCATGGCC TCGAGGTGCT CGTACAGCTT CTCGAACTCC TCGCCGTGAA AGTCGGCGAG 7560
CGACATGCTG GTGTCCCGGT CGAACAGGGA CCAGGTGACG TTTTCCTCGC CGTCGAGGTG 7620
GCGGATCAGG CGCTTGAAGA ACAGGTCCGG CATCCAGAGG GCGCTGAAGA TGTTGTCGCA 7680
GCGCTGGGCC TCCTCGCCGG CGAGGACGCC CTTCATTCTG AGCACGGCCC GAACGTCGCT 7740
GTGCCAGGGT TCCAGGTACA CGCACGCCCC GGTGGGGCGC GTGCTCTGTT TGTTGTGCGC 7800
CGCCACCAGG GAGTCCAGGA CCTTCAGGGC CGGCATGATG CTGGCGGTGC CGGGGCTGGC 7860
GTCGTTGAAC GCCTGCATGC ACAGCCCGAT GCCCCCGTTG CGGGCGAGGA TGGCGCTCAC 7920
GTTGCCGGTG ATGGCCCGGA GGGTGGCCTG GTTAGTGGTG GCCTGGGGGT TTACCAGGTA 7980
GCAGCTGGAC GTGTAGTAGT TGCGGGTTCC GAGGTTCAGC ATGGCGGGGG TGGACGGCAC 8040
GATCTGGTGG TCGTAGAGGC GGTGGAAAAA GAACTTGAAC ATTTCCCACC ACGACCCCTG 8100
TCGCCCCAGG GCGATGTGGC GCATGCCGCG GGTCGCCCGG CACGCCAGGA ACCCGGCGAT 8160
GCGGGTGTAC ATCTGGAAGA CGGACTCCAT GTAGTGCCCG CCGAAGCGCT TGAGGTAAAA 8220
CTCTTCGTAC 
GATCAGACAG 
GCGAAGCCTT 
AAAGGAGGCC 
GCGGGCGGAC 
CAGATGATAG 
CATCTCAGCG 
GCCGAAGGTT 
CAGCATGAGC 
GTTCTCGGGC 
GCTGTCCAGA 
GTCGGCCGAC 
GTTTCCGGGG 
GCTCCATCTG 
GTCGCTGTCA 
CTCCGAACCC 
CGACGGGCCG 
ATCGCGACGG 
AGCCTGGGGT 
CTGGGTCCCC 
GGTTCGGCCA 
AGCGCCCTCG 
GTTGGAGCCG 
ATCGCTGGAA 
GGGGGCGACC 
GGCGGATGCG 
AACGGAAACA 
CGGCCGACGA 
AGGAACGCAA 
CAAGTCCCCG 
CCGGCACACA 
CTCCTGAGTC 
TTTCAGAGAC 
ACGCACGGGG 
GGGGGAAAGT 
GCGCAGGCCC 
CCGACCGCCC 
CCCAGCTCTG 
GGCTGCCAGC 
CGCGGGGATT 
GGAAGGGGCG 
AGCCGCTCGA 
CAACAGAGCT 



TTCAGCGCCG 
TCGTAGGGGT 
TCCGTCAGCC 
TCCCGGGTAC 
CGCCGCACCA 
GGCGTGTATG 
AGCGCGTAGT 
CGTGGGGGCA 
GCGGGCTCGC 
GTGAGTTCTA 
ACCGGCGCCA 
GCGCGCGGGT 
GAGTCCCCGG 
CGACGAACGA 
TCGTCGTCAG 
GAGCCCGTAT 
TCTGACCATG 
GCGCAGCACT 
CCTTGGGGTT 
ACGGATGTAG 
TCCCCGCCGG 
AGGTCACGCA 
CATTGAACAA 
AGCACCATCA 
TCGGGCTCCC 
GCAGGGCGGT 
GCGGCGGGCC 
ATCACCCGGG 
CAGGAATCGG 
AAGCACCCAC 
GGGGTGGGGT 
AGGGGGGTGT 
ACACGCCAAA 
CCGCGCACGG 
GGGAGGGGAA 
GCCACCCGCC 
CAAACCGCTC 
CGAAGGCGGC 
GGTACAGCCT 
GCGAGCTCGC 
TCATGTCCTC 
GGGCCGCCTC 
CCTGGTCCCG 



ACTGCAGCCC CCGCTCGACG 
TCAGGGCCTG . GGCCAGGATC 
CGAAGTCCAG GTCCACCTCC 
GGATGCGCAG GTGAACCAGA 
GGGGTTTGAA CCCGTTAACC 
CGTTGGGGGG GACCGGGGGA 
TCAGGAGCCC AAAGTCGTCC 
CGCGCTTGCT CTCCTCGCGG 
GGTCGACGGC GTCCCCCAGA 
GGGGGACTGG GTAGCCGGGG 
CTTCCGCCTG GGGTGCGGCG 
CCGTCGCGGA GCCCGGCCCG 
GGCGCCGGGG CTTGGGAAAG 
CAACGTCGGG CTGCACGGAG 
TCGCCCCTGC GGCCCAGATC 
CCTCGTCCGA GGAGTCCGAG 
ACTCCGCGGC CCCGACGTCC 
CGTGGCCCCA TGGAAAGGGG 
CGGGGGTCCT TGGGTTCCCG 
TCGCGGACGG GCCCGAGGTT 
CTGCGACGTT TGAGATCGCG 
AATGACCGCG CGCCACGTCT 
AGCTGCTGTC GCTAATGCGG 
CGCCGCTGAC TTTCCTGCAA 
GGGG TTCCTG TCGTTCGGAC 
TGGCCATTGG AACCAAGGTG 
CGTGGGAGGC GGGGTGGGCG 
CAGTCGCCAA CAATCTGTCG 
TGGGCGAAAA CGTGAGGGCT 
CTGTTGTGGT AACCCAAACC 
GCGATCGTGT GCTGAGTCAA 
AACGTGTCCA TGAATCCCAT 
CACACCACGT CACGGATTTA 
CCAAGACGGC GAGGGCGGAG 
GGTCGCTCGG GTTGGGCGGC 
GGGGCGCCAC ACCACGCCCT 
GGTGCGGTGC AGACAGCGGG 
GTATGTGCAG GTGCGTGCGG 
GTCGACAAAC TGGGGGTTCG 
GGCCAGGACG GCGAGCGGAA 
GGCCCCGTGA ATCGTGTTGG 
CAGTCCGCGC GGGGGGAGCT 
GCACTGTCGG TAGGTGAAAA 

191 



AGCGTGTTCG 
ATTAGCTGGG 
TTGGAGCGCA 
ATCCCCAGGA 
AGCCGCGTCG 
AGGTCCAGGC 
TCCGTGAGGC 
GCGCACCGAC 
AACCGCGCCA 
TCCGTCCCCA 
GCGTGGGCCG 
GTGCCGGCGC 
GCCACGGGGG 
TCGTCCGACC 
GAAGAGGATC 
TCCTCCGTTT 
TTCTCGGCGC 
GGAGGAGGGG 
TGGAGGAACT 
CCGCCGAGCG 
ACGAAGGCGC 
CCGTCGATTA 
TAGGCCGCGG 
AACACGTGGT 
GGAGACCGCG 
ACGGTGGCTC 
GTACGACCGA 
ACAGGACAGC 
CGCAGGAGAG 
GCCCGCGTTG 
TCTAAGGAGG 
TTGCATGTCG 
ACCTGGCTTT 
TTGGTGTGTA 
CGTCGTCAAT 
CCAGGATGAC 
GGCTGTTGTC 
TAGGGCGCCC 
TGACTTGGCT 
AGTCAACGTT 
TTATCCGGAG 
GGCTCTTGAC 
CCAAGTACAG 



GGGTGCTGTG 

CCTCGTGTTC 

TCCATTCCTC 

TGCGATACAG 

CATACTCCCT 

ACAGGCGTCG 

GGGGGGCGCT 

AGAAGTACTC 

CCGCCTCCGC 

CAGTGGGCTG 

CGGAATCGGA 

CCAGGCCGGG 

CAGGGCCGTC 

GCGAGTCGGA 

GAGACAGCGT 

CGGAGTCGGA 

CGCCCCTGGC 

GCGGGGGGAC 

CCCCGGACGT 

CCACGACGGC 

CGGTGGACGT 

TCATACTGCA 

GGCCGGGGGG 

CGCCGCCAGG 

CTCCGGCGAG 

GGACGCGATG 

AAGGACGCAC 

ACCGGCCGAC 

TGGCGACCGT 
GCTTTTATCC 

ATATGCCTAT 
GTTTCCGCCA 
ATTGAGACGG 
GGAGGGGGGG 
AGACGATCAC 
AACCGGAACG 
TGGCATGACA 
CCGCAGATCC 
CGCCGAGCAC 
TCGGTTCGGG 
GCGCCCGAAC 
CACGTACACG 
GAACGCCGCC 



8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10?60 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
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25 



CCAGGGCCCC 
ACGGCCGAAC 
ACGGCGGTGA 
GCATGACCTC 
GGTCCAGCCG 
CCTCGGCGGC 
ACACAAGGCC 
CCAGGAGCTG 
GGCGGGCCAG 
GCTCCGCGTT 
GGGCGCCGGG 
CCTGCTCGGT 
CCAGCAGCCA 
CGTCGCGGCG 
CGCGCGGGCT 
GTGGCTTGGT 
TGGTGGAGGT 
CCCCCCCTCC 
GTCCCTCTGC 
TATTCGTCTC 
CATCCTATCT 
CCCCTCTCTC 
TCCTCGTACT 
CCTATCTTTC 
CTCTCCTCCT 
CCTTGTTTAC 
TGACCCCCCC 
CCCTCCTCTT 
CTTCTCTT 



CCAGGCACAG 
AGGGCCTGCC 
CGCTAGCCAG 
GTGGGGAAAC 
CGCCCCGGCC 
GTCGCGCGCA 
CGCGCGCGCG 
GCCCCAGGCC 
GTGGGGGAGG 
TGGTTGGCAG 
GACGGTCAGC 
ATCGTCTCGG 
CATCGTACTG 
AGCGATGGGC 
GGTGGTGGTT 
TTTCATGGTT 
GGGCCACCCC 
TCGTGTCCTC 
CCCTCCCTCC 
TTGGCCTCTC 
CCTCTCCTTC 
TCTCTGTCCA 
CTCTACTCCT 
CACTTCTCCC 
TCCGTCTATC 
ATCACTTTTG 
CCCTCTCCCT 
CGTTCTCCTC 



CTCGGCGTCC 
CGGGTGGCCG 
CTCGTCCTGT 
ACGTGCGTGT 
CGCGTCCCGC 
TCGTAGGCGG 
CAGCCGCTCT 
TCGGCCAGTC 
TCGGTGGGGT 
AGATCCGTCA 
CCTCCCGCGC 
GTAGCGCGCG 
GCCGTCCTCC 
GGGCGCAGGA 
TCCACGGCAC 
TTCCCGCCGA 
CCNGGCCGCC 
TACCCATCTC 
CTCCTCCCCG 
GCTTATCCCC 
TTACCATGAC 
TGCTCTCCCC 
CCCTATCATG 
CCCCGCTCTA 
TCTCTCCTCC 
AAACTCCAGC 
TTCTCTCGTA 
CCTTCTTTTC 



AGGTCCACAA 
GTGTGTGTGG 
GTGACCACGA 
GGACCATGGC 
CGTAGTTGGT 
CGGCGCACGC 
CGGCCCGCCC 
GCTCGGTCTG 
GCCGCAGGGC 
GGGTTACCTG 
GCCGGGCTCC 
TTCCCGGAGA 
TGGGGGCCCT 
CGCGCCGGAG 
TCTCGGCCCA 
TCGCCAACGG 
CGTANCCCTC 
TCTTCCCATT 
TTTCCTCTCA 
ATATCTCCCC 
AATTCTCTCC 
TCGTCTATCT 
TCATGCGTGC 
CTCTTCCTCT 
CCTCTTTTCT 
TCTTGCTCTC 
TCTCAGTCCT 
GATCCCGCTC 



AGGCGCAGGC 
CCTCCTGGGG 
CACTAGCCCC 
GCGCAGGCAT 
CGTGATGTGG 
GGCCACCAGA 
GGACCCCAGG 
CCGGCCGGGC 
CAAAAGGAGC 
GCGGGTCAGG 
CCTGAGGATC 
CGACTCCGCG 
GTCCCCCAAA 
CGGGGCGTGG 
CGCCATCGGG 
GCGAGTTGTG 
CCCCCCCCTT 
CCCCTCCCCC 
TTCCCTCCTT 
TCCTGTGGTC 
TCATCTCTCC 
ACCTCCACAT 
CCTCGCGTAC 
CATGCGTCGC 
TCACCCCTCG 
CCCCACTGCT 
CCTCTCTCCC 
CTCTGTCTCC 



CGGGATGGTA 
TCCGCTGCAG 
CCGAAAAACC 
TCGGAAAACC 
GCCCGGACCG 
AAGTTAAACG 
GCGGAGGCCT 
GGAGCCCGAT 
GCCCCGGCCC 
TGATAGCGGG 
TTGTCCACGG 
GGGTCGATCC 
AGCACCGGGC 
CCCGCGAGCT 
GCTGTCGGGA 
GGGGGGGGGG 
CTGCCCTCCC 
CTCTCCCTCC 
ATTCTCCCAC 
CTATTATTCT 
TCTCCTTACT 
CTCCATTCTC 
TCTTCTCTCC 
TCTATCTTCT 
GTGACCCCCT 
CCTCCTCCCC 
CCTCTTTTTT 
TCCTCCTTCT 



10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
12489 
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(2) INFORMATION FOR SEQ ID NO; 18: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 335 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Val Cys Pro Pro Pro Pro Thr Asn Met Ala Val Val Cys Gly Ser Gly
1 5 10 15

Leu Arg Leu Arg Pro Phe His Pro Pro Ser Pro Ser Phe Phe Val Leu
20 25 30

Arg Ala Leu Ile Arg Ala Gly Pro Gly Pro Phe Ala Asp Arg Ala Pro
35 40 45

Ser Gly Pro Gly Cys Gly Met Cys Arg Gly Asp Ser Pro Gly Val Ala
50 55 60

Gly Gly Ser Gly Glu His Cys Leu Gly Gly Asp Asp Gly Asp Asp Gly
65 70 75 80

Arg Pro Arg Leu Ala Cys Val Gly Ala Ile Arg Phe Ala His Leu Trp
85 90 95

Leu Gln Ala Thr Thr Leu Gly Phe Val Gly Ser Val Val Leu Ser Arg
100 105 110

Gly Pro Tyr Ala Asp Ala Met Ser Gly Ala Phe Val Ile Gly Ser Thr
115 120 125

Gly Leu Gly Phe Leu Arg Ala Pro Pro Ala Phe Ala Arg Pro Pro Thr
130 135 140

Arg Val Cys Ala Trp Leu Arg Leu Val Gly Gly Gly Ala Ala Val Trp
145 150 155 160

Ser Leu Gly Glu Ala Gly Ala Pro Pro Gly Val Pro Gly Pro Ala Thr
165 170 175

Gln Cys Leu Ala Leu Gly Ala Ala Tyr Ala Ala Leu Leu Val Leu Ala
180 185 190

Asp Asp Val His Pro Leu Phe Leu Leu Ala Pro Arg Pro Leu Phe Val
195 200 205

Gly Thr Leu Gly Val Val Val Gly Gly Leu Thr Ile Gly Gly Ser Ala
210 215 220

Arg Tyr Trp Trp Ile Asp Pro Arg Ala Ala Ala Ala Leu Thr Ala Ala
225 230 235 240

Val Val Ala Gly Leu Gly Thr Thr Ala Ala Gly Asp Ser Phe Ser Lys
245 250 255

Ala Cys Pro Arg His Arg Arg Phe Cys Val Val Ser Ala Val Glu Ser
260 265 270

Pro Pro Pro Arg Tyr Ala Pro Glu Asp Ala Glu Arg Pro Thr Asp His
275 280 285

Gly Pro Leu Leu Pro Ser Thr His His Gln Arg Ser Pro Arg Val Cys
290 295 300

Gly Asp Gly Ala Ala Arg Pro Glu Asn Ile Trp Val Pro Val Val Thr
305 310 315 320

Phe Ala Gly Ala Leu Ala Ala Cys Ala Arg Trp Trp Glu Arg Ser
325 330 335



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 466 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala His Leu Pro Gly Gly Ala Ala Ala Ala Pro Leu Ser Glu Asp
1 5 10 15

Ala Ile Pro Ser Pro Arg Glu Arg Thr Glu Asp Trp Pro Pro Cys Gln
20 25 30

Ile Val Leu Gln Gly Ala Glu Leu Asn Gly Ile Leu Gln Ala Phe Ala
35 40 45

Pro Leu Arg Thr Ser Leu Leu Asp Ser Leu Leu Val Val Gly Asp Arg
50 55 60

Gly Ile Leu Val His Asn Ala Ile Phe Gly Glu Gln Val Phe Leu Pro
65 70 75 80

Leu Asp His Ser Gln Phe Ser Arg Tyr Arg Trp Gly Gly Pro Thr Ala
85 90 95

Ala Phe Leu Ser Leu Val Asp Gln Lys Arg Ser Leu Leu Ser Val Phe
100 105 110

Arg Ala Asn Gln Tyr Pro Asp Leu Arg Arg Val Glu Leu Thr Val Thr
115 120 125

Gly Gln Ala Pro Phe Arg Thr Leu Val Gln Arg Ile Trp Thr Thr Ala
130 135 140

Ser Asp Gly Glu Ala Val Glu Leu Ala Ser Glu Thr Leu Met Lys Arg
145 150 155 160

Glu Leu Thr Ser Phe Ala Val Leu Leu Pro Gln Gly Asp Pro Asp Val
165 170 175

Gln Leu Arg Leu Thr Lys Pro Gln Leu Thr Lys Val Val Asn Ala Val
180 185 190

Gly Asp Glu Thr Ala Lys Pro Thr Thr Phe Glu Leu Gly Pro Asn Gly
195 200 205

Lys Phe Ser Val Phe Asn Ala Arg Thr Cys Val Thr Phe Ala Ala Arg
210 215 220

Glu Glu Gly Ala Ser Ser Ser Thr Ser Ala Gln Val Gln Ile Leu Thr
225 230 235 240

Ser Ala Leu Lys Lys Ala Gly Gln Ala Ala Ala Asn Ala Lys Thr Val
245 250 255

Tyr Gly Glu Asn Thr Thr Phe Ser Val Val Val Asp Asp Cys Ser Met
260 265 270

Arg Ala Val Leu Arg Arg Leu Gln Val Gly Gly Gly Thr Leu Asn Phe
275 280 285

Phe Leu Thr Ala Asp Val Pro Ser Val Cys Val Thr Ala Thr Gly Pro
290 295 300

Asn Ala Val Ser Ala Val Phe Leu Leu Lys Pro Gln Arg Val Cys Leu
305 310 315 320

Asn Trp Leu Gly Arg Thr Pro Gly Ser Ser Thr Gly Ser Leu Ala Ser
325 330 335

Gln Asp Ser Arg Ala Gly Pro Thr Asp Ser Gln Asp Phe Ser Ser Glu
340 345 350

Pro Asp Ala Gly Asp Arg Gly Ala Pro Glu Glu Glu Gly Leu Glu Gly
355 360 365

Gln Ala Arg Val Pro Pro Ala Phe Pro Glu Pro Pro Gly Thr Lys Arg
370 375 380

Arg His Ala Gly Ala Glu Val Val Pro Ala Asp Asp Ala Thr Lys Arg
385 390 395 400

Pro Lys Thr Gly Val Pro Ala Ala Pro Thr Arg Ala Glu Ser Pro Pro
405 410 415

Leu Ser Ala Arg Tyr Gly Pro Glu Ala Ala Glu Gly Gly Gly Asp Gly
420 425 430

Gly Arg Tyr Ala Trp Tyr Phe Arg Asp Leu Gln Thr Gly Asp Asp Ser
435 440 445

Pro Leu Ser Ala Phe Arg Gly Pro Gln Arg Pro Pro Tyr Gly Phe Gly
450 455 460

Leu Pro
465

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 218 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Gly Leu Phe Gly Met Met Lys Phe Ala Gln Thr His His Leu Val
1 5 10 15

Lys Arg Arg Gly Leu Arg Ala Pro Glu Gly Tyr Phe Thr Pro Ile Ala
20 25 30

Val Asp Leu Trp Asn Val Met Tyr Thr Leu Val Val Lys Tyr Gln Arg
35 40 45

Arg Tyr Pro Ser Tyr Asp Arg Glu Ala Ile Thr Leu His Cys Leu Cys
50 55 60

Ser Met Leu Arg Val Phe Thr Gln Lys Ser Leu Phe Pro Ile Phe Val
65 70 75 80

Thr Asp Arg Gly Val Glu Cys Thr Glu Pro Val Val Phe Gly Ala Lys
85 90 95

Ala Ile Leu Ala Arg Thr Thr Ala Gln Cys Arg Thr Asp Glu Glu Ala
100 105 110

Ser Asp Val Asp Asp Pro Pro Phe Pro His His Arg Leu Gln Ala Gln
115 120 125

Phe Pro Pro Phe Gln His Ala Pro Pro Arg Ala Arg Leu Arg Pro Gly
130 135 140

Gly Pro Gly Glu Arg Gly Pro Pro Ala Gln Ala Arg Arg Pro Pro Gly
145 150 155 160

Ala Arg Pro Arg Ser Arg Pro Cys Ala Trp Leu Thr Cys Ser Val Ser
165 170 175

Ala Phe Cys Gly Arg Trp Gly Thr Pro Thr Ser Thr Arg Val Ser Trp
180 185 190

Arg Pro Thr Thr Pro Ala Arg Thr Ser Ile Ile Pro Thr Arg Ser Arg
195 200 205

Thr Cys Ile Pro Arg Ile Pro Ile Ser Cys
210 215

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Val His Thr Thr Asp Thr Asp Leu Leu Leu Met Gly Cys Asp Ile Val
1 5 10 15

Leu Asp Ile Ser Thr Gly Tyr Ile Pro Thr Ile His Cys Arg Asp Leu
20 25 30

Leu Gln Tyr Phe Lys Met Ser Tyr Pro Gln Phe Leu Ala Leu Phe Val
35 40 45

Arg Cys His Thr Asp Leu His Pro Asn Asn Thr Tyr Ala Ser Val Glu
50 55 60

Asp Val Leu Arg Glu Cys His Trp Thr Ala Pro Ser Arg Ser Gln Ala
65 70 75 80

Arg Arg Gly Ala Arg Arg Glu Arg Ala Asn Ser Arg Ser Leu Glu Ser
85 90 95

Met Pro Thr Leu Thr Ala Ala Pro Val Gly Leu Glu Thr Arg Ile Ser
100 105 110

Trp Thr Glu Ile Leu Ala Gln Gln Ile Ala Gly Glu Asp Asp Tyr Glu
115 120 125

Glu Asp Pro Pro Leu Gln Pro Pro Asp Val Ala Gly Gly Pro Arg Asp
130 135 140

Gly Ala Arg Ser Ser Ser Ser Glu Ile Leu Thr Pro Pro Glu Leu Val
145 150 155 160

Gln Val Pro Asn Ala Gln Arg Val Ala Glu His Arg Gly Tyr Val Ala
165 170 175

Gly Arg Arg Arg His Val Ile His Asp Ala Pro Glu Ala Leu Asp Trp
180 185 190

Leu Pro Asp Pro Met Thr Ile Ala Glu Leu Val Glu His Arg Tyr Val
195 200 205

Lys Tyr Val Ile Ser Leu Ile Ser Pro Lys Glu Arg Gly Pro Trp Thr
210 215 220

Leu Leu Lys Arg Leu Pro Ile Tyr Gln Asp Leu Arg Asp Glu Asp Leu
225 230 235 240

Ala Arg Ser Ile Val Thr Arg His Ile Thr Ala Pro Asp Ile Ala Asp
245 250 255

Arg Phe Leu Ala Gln Leu Trp Ala His Ala Pro Pro Pro Ala Phe Tyr
260 265 270

Lys Asp Val Leu Ala Lys Phe Trp Asp Glu
275 280



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 528 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Arg Ala Gly Leu Val Phe Phe Val Gly Val Trp Val Val Ser Cys
1 5 10 15

Leu Ala Ala Ala Pro Arg Thr Ser Trp Lys Arg Val Thr Ser Gly Glu
20 25 30

Asp Val Val Leu Leu Pro Ala Pro Ala Gly Pro Glu Glu Arg Thr Arg
35 40 45

Ala His Lys Leu Leu Trp Ala Ala Glu Pro Leu Asp Ala Cys Gly Pro
50 55 60

Leu Arg Pro Ser Trp Val Trp Pro Pro Arg Arg Val Leu Glu Thr Val
65 70 75 80

Val Asp Ala Ala Cys Met Arg Ala Pro Glu Pro Leu Ala Ile Ala Tyr
85 90 95

Ser Pro Pro Phe Pro Ala Gly Asp Glu Gly Ser Glu Leu Ala Trp Arg
100 105 110

Asp Arg Val Ala Val Val Asn Glu Ser Leu Val Ile Tyr Gly Ala Leu
115 120 125

Glu Thr Asp Ser Gly Thr Leu Ser Val Val Gly Leu Ser Asp Glu Ala
130 135 140

Arg Gln Val Ala Ser Val Val Leu Val Val Glu Pro Ala Pro Val Pro
145 150 155 160

Thr Pro Thr Pro Asp Asp Tyr Asp Glu Glu Asp Asp Ala Gly Val Ser
165 170 175

Thr Pro Val Ser Val Pro Pro Pro Thr Pro Pro Arg Gly Pro Pro Val
180 185 190

Ala Pro Pro Thr His Pro Arg Val Ile Pro Glu Val Ser His Val Arg
195 200 205

Gly Val Thr Val His Met Pro Glu Ala Ile Leu Phe Ala Pro Gly Glu
210 215 220

Thr Phe Gly Thr Asn Val Ser Ile His Ala Ile Ala His Asp Asp Gly
225 230 235 240

Pro Tyr Ala Met Asp Val Val Trp Met Arg Phe Asp Val Pro Ser Ser
245 250 255

Cys Ala Glu Met Arg Ile Tyr Glu Ala Cys Leu Tyr His Pro Gln Leu
260 265 270

Pro Glu Cys Leu Ser Pro Ala Asp Ala Pro Cys Ala Val Ser Ser Trp
275 280 285

Ala Tyr Arg Leu Ala Val Arg Ser Tyr Ala Gly Cys Ser Arg Thr Thr
290 295 300

Pro Pro Pro Arg Cys Phe Ala Glu Ala Arg Met Glu Pro Val Pro Gly
305 310 315 320

Leu Ala Trp Leu Ala Ser Thr Val Asn Leu Glu Phe Gln His Asp Gln
325 330 335

His Ala Gly Leu Cys Val Val Tyr Val Asp Asp His Ile His Ala Trp
340 345 350

Gly His Met Thr Ile Ser Thr Ala Ala Gln Tyr Arg Asn Ala Val Val
355 360 365

Glu Gln His Leu Pro Gln Arg Gln Pro Glu Pro Val Glu Pro Trp His
370 375 380

Val Arg Ala Pro Pro Pro Ala Pro Ser Arg Pro Leu Arg Leu Gly Ala
385 390 395 400

Val Leu Gly Ala Ala Leu Leu Leu Ala Ala Leu Gly Leu Ser Ala Trp
405 410 415

Ala Cys Met Thr Cys Trp Arg Arg Arg Ser Trp Arg Ala Val Lys Ser
420 425 430

Arg Ala Ser Ala Thr Gly Pro Thr Tyr Ile Arg Val Ala Asp Ser Glu
435 440 445

Leu Tyr Ala Asp Trp Ser Ser Asp Ser Glu Gly Glu Arg Asp Gly Ser
450 455 460

Leu Trp Gln Asp Pro Pro Glu Arg Pro Asp Ser Pro Ser Thr Asn Gly
465 470 475 480

Ser Gly Phe Glu Ile Leu Ser Pro Thr Ala Pro Ser Val Tyr Pro His
485 490 495

Ser Glu Gly Arg Lys Ser Arg Arg Pro Leu Thr Thr Phe Gly Ser Gly
500 505 510

Ser Pro Gly Arg Arg His Ser Gln Ala Ser Tyr Ser Ser Val Leu Trp
515 520 525

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1160 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Val Ile Arg Arg Pro Val Arg Pro Phe Gly Arg Thr Ala His Pro Ala
1 5 10 15

Ser His Gly Pro Ala Ala Val Ser Val His Arg Val Arg Ala Thr Val
20 25 30

Thr Leu Val Pro Met Ala Asn Arg Pro Ala Ala Ser Ala Gly Ala Arg
35 40 45

Ser Pro Ser Gln Glu Pro Arg Glu Pro Glu Val Ala Pro Pro Gly Gly
50 55 60

Asp His Val Phe Cys Arg Lys Val Ser Gly Val Met Val Leu Ser Ser
65 70 75 80

Asp Pro Pro Gly Pro Ala Ala Tyr Arg Ile Ser Asp Ser Ser Phe Val
85 90 95

Gln Cys Gly Ser Asn Cys Ser Met Ile Ile Asp Gly Asp Val Arg His
100 105 110

Leu Arg Asp Leu Glu Gly Ala Thr Ser Thr Gly Ala Phe Val Ala Ile
115 120 125

Ser Asn Val Ala Ala Gly Gly Asp Gly Arg Thr Ala Val Val Gly Gly
130 135 140

Thr Ser Gly Pro Ser Ala Thr Thr Ser Val Gly Thr Gln Thr Ser Gly
145 150 155 160

Glu Phe Leu His Gly Asn Pro Arg Thr Pro Glu Pro Gln Gly Pro Gln
165 170 175

Ala Val Pro Pro Pro Pro Pro Pro Pro Phe Pro Trp Gly His Glu Cys
180 185 190

Cys Ala Arg Arg Asp Arg Gly Ala Glu Lys Asp Val Gly Ala Ala Glu
195 200 205

Ser Trp Ser Asp Gly Pro Ser Ser Asp Ser Glu Thr Glu Asp Ser Asp
210 215 220

Ser Ser Asp Glu Asp Thr Gly Ser Gly Ser Glu Thr Leu Ser Arg Ser
225 230 235 240

Ser Ser Ile Trp Ala Ala Gly Ala Thr Asp Asp Asp Asp Ser Asp Ser
245 250 255

Asp Ser Arg Ser Asp Asp Ser Val Gln Pro Asp Val Val Val Arg Arg
260 265 270

Arg Trp Ser Asp Gly Pro Ala Pro Val Ala Phe Pro Lys Pro Arg Arg
275 280 285

Pro Gly Asp Ser Pro Gly Asn Pro Gly Leu Gly Ala Gly Thr Gly Pro
290 295 300

Gly Ser Ala Thr Asp Pro Arg Ala Ser Ala Asp Ser Asp Ser Ala Ala
305 310 315 320

His Ala Ala Ala Pro Gln Ala Glu Val Ala Pro Val Leu Asp Ser Gln
325 330 335

Pro Thr Val Gly Thr Asp Pro Gly Tyr Pro Val Pro Leu Glu Leu Thr
340 345 350

Pro Glu Asn Ala Glu Ala Val Ala Arg Phe Leu Gly Asp Ala Val Asp
355 360 365

Arg Glu Pro Ala Leu Met Leu Glu Tyr Phe Cys Arg Cys Ala Arg Glu
370 375 380

Glu Ser Lys Arg Val Pro Pro Arg Thr Phe Gly Ser Ala Pro Arg Leu
385 390 395 400

Thr Glu Asp Asp Phe Gly Leu Leu Asn Tyr Ala Glu Met Arg Arg Leu
405 410 415

Cys Leu Asp Leu Pro Pro Val Pro Pro Asn Ala Tyr Thr Pro Tyr His
420 425 430

Leu Arg Glu Tyr Ala Thr Arg Leu Val Asn Gly Phe Lys Pro Leu Val
435 440 445

Arg Arg Ser Ala Arg Leu Tyr Arg Ile Leu Gly Ile Leu Val His Leu
450 455 460

Arg Ile Arg Thr Arg Glu Ala Ser Phe Glu Glu Trp Met Arg Ser Lys
465 470 475 480

Glu Val Asp Leu Asp Phe Gly Leu Thr Glu Arg Leu Arg Glu His Glu
485 490 495

Ala Gln Leu Met Ile Leu Ala Gln Ala Leu Asn Pro Tyr Asp Cys Leu
500 505 510

Ile His Ser Thr Pro Asn Thr Leu Val Glu Arg Gly Leu Gln Ser Ala
515 520 525

Leu Lys Tyr Glu Glu Phe Tyr Leu Lys Arg Phe Gly Gly His Tyr Met
530 535 540

Glu Ser Val Phe Gln Met Tyr Thr Arg Ile Ala Gly Phe Leu Ala Cys
545 550 555 560

Arg Ala Thr Arg Gly Met Arg His Ile Ala Leu Gly Arg Gln Gly Ser
565 570 575

Trp Trp Glu Met Phe Lys Phe Phe Phe His Arg Leu Tyr Asp His Gln
580 585 590

Ile Val Pro Ser Thr Pro Ala Met Leu Asn Leu Gly Thr Arg Asn Tyr
595 600 605

Tyr Thr Ser Ser Cys Tyr Leu Val Asn Pro Gln Ala Thr Thr Asn Gln
610 615 620

Ala Thr Leu Arg Ala Ile Thr Gly Asn Val Ser Ala Ile Leu Ala Arg
625 630 635 640

Asn Gly Gly Ile Gly Leu Cys Met Gln Ala Phe Asn Asp Asp Gly Thr
645 650 655

Ala Ser Ile Met Pro Ala Leu Lys Val Leu Asp Ser Leu Val Ala Ala
660 665 670

His Asn Lys Gln Ser Trp Thr Gly Ala Cys Val Tyr Leu Glu Pro Trp
675 680 685

His Ser Asp Val Arg Ala Val Leu Arg Met Lys Gly Val Leu Ala Gly
690 695 700

Glu Glu Ala Gln Arg Cys Asp Asn Ile Phe Ser Ala Leu Trp Met Pro
705 710 715 720

Asp Leu Phe Phe Lys Arg Leu Ile Arg His Leu Asp Gly Glu Glu Asn
725 730 735

Val Thr Trp Ser Leu Phe Asp Arg Asp Thr Ser Met Ser Leu Ala Asp
740 745 750

Phe His Gly Glu Glu Phe Glu Lys Leu Tyr Glu His Leu Glu Ala Met
755 760 765

Gly Phe Gly Glu Thr Ile Pro Ile Gln Asp Leu Ala Tyr Ala Ile Val
770 775 780

Arg Ser Ala Ala Thr Thr Gly Ser Pro Phe Ile Met Phe Lys Asp Ala
785 790 795 800

Val Asn Arg His Tyr Ile Tyr Asn Thr Gln Gly Ala Ala Ile Ala Gly
805 810 815

Ser Asn Leu Cys Thr Glu Ile Val His Pro Ser Ser Lys Arg Ser Ser
820 825 830

Gly Val Cys Asn Leu Gly Ser Val Asn Leu Ala Arg Cys Val Ser Arg
835 840 845

Arg Thr Phe Asp Phe Gly Met Leu Arg Asp Ala Val Gln Ala Cys Val
850 855 860

Leu Met Val Asn Ile Met Ile Asp Ser Thr Leu Gln Pro Thr Pro Gln
865 870 875 880

Cys Arg His Asp Asn Leu Arg Ser Met Gly Ile Gly Met Gln Gly Leu
885 890 895

His Thr Ala Cys Leu Lys Met Gly Leu Asp Leu Glu Ser Ala Glu Phe
900 905 910

Arg Asp Leu Asn Thr His Ile Ala Glu Val Met Leu Leu Ala Ala Met
915 920 925

Lys Thr Ser Asn Ala Leu Cys Val Arg Gly Ala Arg Pro Phe Ser His
930 935 940

Phe Lys Arg Ser Met Tyr Arg Ala Gly Arg Phe His Trp Glu Arg Phe
945 950 955 960

Ser Asn Asp Arg Tyr Glu Gly Glu Trp Glu Met Leu Arg Gln Ser Met
965 970 975

Met Lys His Gly Leu Arg Asn Ser Gln Phe Ile Ala Leu Met Pro Thr
980 985 990

Ala Ala Ser Ala Gln Ile Ser Asp Val Ser Glu Gly Phe Ala Pro Leu
995 1000 1005

Phe Thr Asn Leu Phe Ser Lys Val Thr Arg Asp Gly Glu Thr Leu Arg
1010 1015 1020

Pro Asn Thr Leu Leu Leu Lys Glu Leu Glu Arg Thr Phe Gly Gly Lys
1025 1030 1035 1040

Arg Leu Leu Asp Ala Met Asp Gly Leu Glu Ala Lys Gln Trp Ser Val
1045 1050 1055

Ala Gln Ala Leu Pro Cys Leu Asp Pro Ala His Pro Leu Arg Arg Phe
1060 1065 1070

Lys Thr Ala Phe Asp Tyr Asp Gln Glu Leu Leu Ile Asp Leu Cys Ala
1075 1080 1085

Asp Arg Ala Pro Tyr Val Asp His Ser Gln Ser Met Thr Leu Tyr Val
1090 1095 1100

Thr Glu Lys Ala Asp Gly Thr Leu Pro Ala Ser Thr Leu Val Arg Leu
1105 1110 1115 1120

Leu Val His Ala Tyr Lys Arg Gly Leu Lys Thr Gly Met Tyr Tyr Cys
1125 1130 1135

Lys Val Arg Lys Ala Thr Asn Ser Gly Val Phe Ala Gly Asp Asp Asn
1140 1145 1150

Ile Val Cys Thr Ser Cys Ala Leu
1155 1160



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 269 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Val Arg Arg Arg Leu Arg Cys Ala Arg Arg Arg Arg Gly Gly Pro Gly
1 5 10 15

Pro His His Asp Gln Leu Arg Arg Asp Ala Gly Arg Gly Ala Ala Gly
20 25 30

Pro Val Phe Arg Met Pro Ala Arg His Gly Pro His Ala Arg Val Ser
35 40 45

Pro Arg Gly His Ala Val Phe Arg Gly Ala Ser Val Val Val Thr Gln
50 55 60

Asp Glu Leu Ala Ser Val Thr Ala Val Cys Ser Gly Pro Gln Glu Ala
65 70 75 80

Thr His Thr Gly His Pro Gly Arg Pro Cys Ser Ala Val Thr Ile Pro
85 90 95

Ala Cys Ala Phe Val Asp Leu Asp Ala Glu Leu Cys Leu Gly Gly Pro
100 105 110

Gly Ala Ala Phe Leu Tyr Leu Val Phe Tyr Gln Cys Arg Asp Gln Glu
115 120 125

Leu Cys Cys Val Tyr Val Val Lys Ser Gln Leu Pro Pro Arg Gly Leu
130 135 140

Glu Ala Ala Leu Glu Arg Leu Phe Gly Arg Leu Arg Ile Thr Asn Thr
145 150 155 160

Ile His Gly Ala Glu Asp Met Thr Pro Leu Pro Pro Asn Arg Asn Val
165 170 175

Asp Phe Pro Leu Ala Val Leu Ala Ala Ser Ser Gln Ser Pro Arg Cys
180 185 190

Ser Ala Ser Gln Val Thr Asn Pro Gln Phe Val Asp Arg Leu Tyr Arg
195 200 205

Trp Gln Pro Asp Leu Arg Gly Arg Pro Thr Ala Arg Thr Cys Thr Tyr
210 215 220

Ala Ala Phe Ala Glu Leu Gly Val Met Pro Asp Asn Ser Pro Arg Cys
225 230 235 240

Leu His Arg Thr Glu Arg Phe Gly Ala Val Gly Val Pro Val Val Ile
245 250 255

Gly Val Val Trp Arg Pro Gly Gly Trp Arg Ala Cys Ala
260 265



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 347 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Met Lys Thr Lys Pro Leu Pro Thr Ala Pro Met Ala Trp Ala Glu Ser
1 5 10 15

Ala Val Glu Thr Thr Thr Ser Pro Arg Glu Leu Ala Gly His Ala Pro
20 25 30

Leu Arg Arg Val Leu Arg Pro Pro Ile Ala Arg Arg Asp Gly Pro Val
35 40 45

Leu Leu Gly Asp Arg Ala Pro Arg Arg Thr Ala Ser Thr Met Trp Leu
50 55 60

Leu Gly Ile Asp Pro Ala Glu Ser Ser Pro Gly Thr Arg Ala Thr Arg
65 70 75 80

Asp Asp Thr Glu Gln Ala Val Asp Lys Ile Leu Arg Gly Ala Arg Arg
85 90 95

Ala Gly Gly Leu Thr Val Pro Gly Ala Pro Arg Tyr His Leu Thr Arg
100 105 110

Gln Val Thr Leu Thr Asp Leu Cys Gln Pro Asn Ala Glu Arg Ala Gly
115 120 125

Ala Leu Leu Leu Ala Leu Arg His Pro Thr Asp Leu Pro His Leu Ala
130 135 140

Arg His Arg Ala Pro Pro Gly Arg Gln Thr Glu Arg Leu Ala Glu Ala
145 150 155 160

Trp Gly Gln Leu Leu Glu Ala Ser Ala Leu Gly Ser Gly Arg Ala Glu
165 170 175

Ser Gly Cys Ala Arg Ala Gly Leu Val Ser Phe Asn Phe Leu Val Ala
180 185 190

Ala Cys Ala Ala Ala Tyr Asp Ala Arg Asp Ala Ala Glu Ala Val Arg
195 200 205

Ala His Ile Thr Thr Asn Tyr Gly Gly Thr Arg Ala Gly Ala Arg Leu
210 215 220

Asp Arg Phe Ser Glu Cys Leu Arg Ala Met Val His Thr His Val Phe
225 230 235 240

Phe Val Met Arg Phe Phe Gly Gly Leu Val Ser Trp Ser His Arg Thr
245 250 255

Ser Trp Leu Asp Pro Ser Ala Ala Asp Pro Arg Arg Pro His Thr Pro
260 265 270

Ala Thr Arg Ala Gly Pro Val Arg Pro Leu Pro Ser Arg Pro Ala Pro
275 280 285

Leu Trp Thr Trp Thr Pro Ser Cys Ala Trp Gly Ala Leu Gly Arg Arg
290 295 300

Ser Cys Thr Trp Phe Ser Pro Thr Asp Ser Ala Gly Thr Arg Ser Ser
305 310 315 320

Val Ala Cys Thr Trp Ser Arg Ala Ser Ser Pro Arg Ala Asp Trp Arg
325 330 335

Arg Pro Ser Ser Gly Cys Ser Gly Ala Ser Gly
340 345

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12701 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTAGAAGTAA GGAGAGAGTG AGAGTATAGA GAAATAGATG AGAGAGAGAA GGTNAGATAA 60

ATAAGAGGGA CAGAGAGTAG GGAGTAGGGT TAGAGAGGAT GAGGGAGAAA GATGAGAGAA 120

GAGGGGAAGA TATATAGATG AGAGAAGGGA AGGAATGAGA GAGGGAAGAA TGATGGAAGG 180 

GAAGGAGNAG AGAAGAGTAG TAGAGGGAAA AAAGAGAAGG AGAAAAGNAA GGAAAGAAAT 240 

AGAAGAGNAG AGAAGGAAGA GAGAAAGAGA GAAGGAAGGA GGGGAAAGAA GAAANAAGAA 300 

GGGAAGAAGG AAAGGAAGAG NAGAAGAAGG AAGAGAAGAG AAAGAAGAAG NAGAGGGGGA 360

AGAAAAGAGA GGAAGAGGGG AAGGAGGAAA GGAGGGAGAA GAAAAAANAA AAAACCACAC 420 

GGCGGCCGAA ACGTCGGGGG AACCGGTAGA AGTCCTGCAG GTCGGACGAA CCAACGGACA 480 

CCTCCGCAAA GCGCGCGCGC GCCTCCCCCG CGGCGTCGCG ACAGACCAGA TACAGCAGGG 540 

CGTGGAGGGA GTCGCGCGTG CGCGGGGGCA GCCATACCGC GTATAGGGTA ATGGCGCTGA 600 

CGCTCTCCTC CACCCAAACG ATGCCGGGGG CTTCCATGCC ACGACGCCCG GGGGTTGCCG 660

TGTATCGAAC GAGCGCGGCC CCAGACTTAT AGGGTGCTAA AGTTCACCGC CCCCTGCATC 720 

ATGGGCCAGG CCTCGGTGGG AAGCTCCGAC AGAGCCGCCT CGAGAATGAT GTCAGTGTTG 780 

GGCTGGGCGC CGGAGGCGTG CGTGCGCAAG CAGCGCCCCC ACGCGGGCGC GCGCAGCTTG 840 

AAGCGCGCGC CCGCAAACTC CCGCTTATGG GCCATCAGCA GCGCGTACAG CTGTCTGTGC 900 

GTCCGGCAGG CGCTGTGGTC GATGCGGTGG GCGTCCAGCA GCTCCACGAT GGCTCGCTTG 960

GTGAGGTTTT TAACGCGCCC CGCCCCGGGA AACGTCTGCG TGCTCTTGGC CAGCTGCACC 1020 

CCGAACAGTT CGCCCCAGAT GATCTTGAAC AGCGACAGCG CGTGCTCCGT CTCGCTCACG 1080 

GACCCGCGCG GGGGGCAGCC GCTCAGGGCG TCGGCCACGC GCTTAACCGC GTCCTCCGAC 1140 

AGCAAGGGGC CGTCGGTCAC GTTACAGTGG CCCAGTTCGA ACACCAGCTG CATGTAGCGG 1200 

TCGTAGTGGG GGTTCAGCAG CTCCAGCACG TCCTCGGGGC TAAAGGTTCG CCCCGACCCC 1260

CCGGCCATCG AGTCCCACTG CAGGCACGCG GCCATGGTGC TGCACAGACG GAACAGCTCC 1320 

CAGACGGGGG CGACGTTTAG GGTGGGGTGT AGGGCCACAA GCTCCAGCTC TCCGGCGGCG 1380 

TTGATCGTGG GGATGACGCC CGTGGCGTAG TGGTCGTAAA GCCGCCGGAA GATGGCGCTG 1440 

CTATGGGCGG CCATGGGGAC GCGAAGACAG GCCTCCAGCA GCACCAGGTA GATGAACCGC 1500 

GTGCGGCCGA CCAGGCTGTT GAGGCCGCGC ATGAGCGCGA CCACCTCGGC CGGCGCGACG 1560

TCCGGCCGGA GGTACTTTTC GACGAAAAGG CCCACCTCCT CCGTCTCGGC GGCCTGGGCC 1620 

GACAGGGACG TGTCGGGGTC CTGGCAGCGC AGCTCCCGCA GATCCCGCTG GGCCCTCAGG 1680 

GCATCAAAAT GTATCCCCCG CAAAAACAGA CAAAAGTTCC TCGGGGTCAG CGCGGCGTCG 1740 

TGGCCCCAGA ACCGCACGTG CATGCAGTTG AGGGTCAGAA GCATGTGGAG GATGTTAAGA 1800 

CTGTCCGCGA GGCACGCCAG CGTGCACCTC TCGAAGTAGT GCTTGTACCG GAATTTGCTG 1860

TAGATGCGCG ACCCCCGCGC CTGCGCCGCG TCGGCGTGCG ACGCGTCGCA GCGCCCTTTG 1920 

AACCGGCGGC ACAACAGGTT CGTCACCTGG GAAAACTGTG CCGGCCACTG CCCGCTGGCG 1980 

CTCACCACGT GGTTGAGCAG CATGGGCGTA AAGACGGGCT CCGAGCGCGC CCCGGACCCG 2040 

TCCATGTAGA TCAGCAGCTC CCCCTTGCGG AGAGTCCGTA CCCGCCCCAG CGACTGGTAC 2100 

ACGGACACCA TGTCCGGCCC GTAGTTCATG GGTTTCACGT AGGCGAACAT GCTGTCAAAG 2160

TGCGGCGGAT CGAAGCTAAG GCCCACCGTC ACGACCGTTG TGTAAATGAC CACCCGGTAC 2220 

CGGCCCCATG TGGTCACTTC GCCGGGCGGG GTGAGCGAGT GGAGCAGCAG CACGCGGTCC 2280 

GTAAACTGCC GGCAGAACCT GGCAACGACC TCCGCGAAGG AGACCGTCGA CAAGAAGATG 2340 

CAGACGTTAT CTCCGCCGGC CAGGCGCGCC TCCACCTCCC CGAAGAAGGT GGCGTCCGGG 2400 

GGGGCGTCCG GGGGGGGCGC CCCGCCCGCC GGCCCCGGCG GGCGCAGGGC CGCCTGCAGG 2460

ACCTCGGGCC CCAGGCGCGG GAGAAACAGA CAACGGCGCG CCGAAAATCC GGGCATGGCA 2520 

TACTCCCCGA TGACCACGTG AACGTTCTTT TCGCCCCGGA GGCTGCACAG AAAGTCCACC 2580 

AGCTGCGCGT TGGCGGTGGC GTCCATGGCG ATGATCCGCG GGCACGTGCG CAGCAGGCGC 2640 
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AGCATCAACG CGTCGACGCG GCCCAGCTGC TGCATCGTCG GCGAGTACAG TTGGCCCAAC 2700 

GTCGACATGA CTTCGTCCAG GACGAGCACG TCGTAGTTGT TCAACAGGTT CGGGCCCACG 2760 

CGATGAAGAC TTTCCACCTG CACGATGAGA CGGTGGAAGG GGCGGTCGTT CATGATGTAA 2820 

TTGGTGGATG AGAAGTAGGT GACGAAGTCG GGCAACCCTG ACTCAGCGAA CCGCGTCGCC 2880 

AGGGTCTGAG TAAAACTCCG ACGACAGGAG ACGACCAGCA CACTCGTGTC CGGAGAGTGG 2940

ATCGCTTCCC CCAACCAGCG GATCAGCGCG GTAGTTTTTC CCGAGCCCAT TGGCGCGCGG 3000 

ACCACAGTTA CGCACCGGGC CGTCGGGGCG CTCGCGTCCG GGAAGGTGAC GGGTCCGTGT 3060 

TGCTGCCGCT CGATCGTTGT TTTCGGGTGG ACCCGGGGAA CCCACTCGGC CAAATCCCCC 3120 

CCGTAAAGCA TCCGCGCCAG CGATACACTC GACGTGTACT GCTCGCACTC GTCATCCCCG 3180 

ATGGGACGCC GGGCCCCCAG GGGATCCCCC GAGGCCGCGC CGGGCGCCGA CGTCGCGCCC 3240

GGGGCGCGGG CGGCGTGGTG GGTCTGGTGT GTGCAGGTGG CGACGTTCAT CGTCTCGGCC 3300 

ATCTGCGTCG TGGGGCTCCT GGTGCTGGCC TCTGTGTTCC GGGACAGGTT TCCCTGCCTT 3360 

TACGCCCCCG CGACCTCTTA TGCGGAGGCG AACGCCACGG TCGAGGTGCG CGGGGGTGTA 3420 

GCCGTCCCCC TCCGGTTGGA CACGCAGAGC CTGCTGGCCA CGTACGCAAT TACGTCTACG 3480 

CTGTTGCTGG CGGCGGCCGT GTACGCCGCG GTGGGCGCGG TGACCTCGCG CTACGAGCGC 3540

GCGCTGGATG CGGCCCGTCG CCTGGCGGCG GCCCGTATGG CGATGCCACA CGCCACGCTA 3600 

ATCGCCGGAA ACGTCTGCGC GTGGCTGTTG CAGATCACAG TCCTGCTGCT GGCCCACCGC 3660 

ATCAGCCAGC TGGCCCACCT TATCTACGTC CTGCACTTTG CGTGCCTCGT GTATCTCGCG 3720 

GCCCATTTTT GCACCAGGGG GGTCCTGAGC GGGACGTACC TGCGTCAGGT TCACGGCCTG 3780 

ATTGACCCGG CGCCGACGCA CCATCGTATC GTCGGTCCGG TGCGGGCAGT AATGACAAAC 3840

GCCTTATTAC TGGGCACCCT CCTGTGCACG GCCGCCGCCG CGGTCTCGTT GAACACGATC 3900 

GCCGCCCTCA ACTTCAACTT TTCCGCCCCG AGCATGCTCA TCTGCCTGAC GACGCTGTTC 3960 

GCCCTGCTTG TCGTGTCGCT GTTGTTGGTG GTCGAGGGGG TGCTGTGTCA CTACGTGCGC 4020 

GTGTTGGTGG GCCCCCACCT CGGGGCCATC GCCGCCACCG GCATCGTCGG CCTGGCCTGC 4080 

GAGCACTACC ACACCGGTGG CTACTACGTG GTGGAGCAGC AGTGGCCGGG GGCCCAGACG 4140

GGAGTCCGCG TCGCCCTGGC GCTCGTCGCC GCCTTTGCCC TCGCCATGGC CGTGCTTCGG 4200

TGCACGCGCG CCTACCTGTA TCACCGGCGA CACCACACTA AATTTTTCGT GCGCATGCGC 4260 

GACACCCGGC ACCGCGCCCA TTCGGCGCTT CGACGCGTAC GCAGCTCCAT GCGCGGTTCT 4320 

AGGCGTGGCG GGCCGCCCGG AGACCCGGGC TACGCGGAAA CCCCCTACGC GAGCGTGTCC 4380 

CACCACGCCG AGATCGACCG GTATGGGGAT TCCGACGGGG ACCCGATCTA CGACGAAGTG 4440

GCCCCCGACC ACGAGGCCGA GCTCTACGCC CGAGTGCAAC GCCCCGGGCC TGTGCCCGAC 4500 

GCCGAGCCCA TTTACGACAC CGTGGAGGGG TATGCGCCAA GGTCCGCGGG GGAGCCGGTG 4560

TACAGCACCG TTCGGCGATG GTAGCCGTTT CGTTCGTTTT AATAAACCGA CGTTGTGCGT 4620 

TTCACCATAC TTCGGCGCGC GCGTGTGTGT GTTTTTTTTG TGGTGTTTAT TTTCCCCCAC 4680 

CCCTTCCTTT TCTTTCGGCC ACCACCCCCC TCCTCCCCCG TACTATACAA CAAAAAATAC 4740

CACACATACG ACCAAATACG GACAATCATT TCTGTCTTTA TTCGCTGTCA GAGAGTGGGG 4800 

GCGTGAGCGT GGCAGGAGGG CGGGCCACGT CGGGGTCCCG CCGTCTGGTG TGACGCGATG 4860 

GGGGGTCCGA TGCGCGCCGG TACTGGGGCC CCGGCGCCCG GGTGACCACG CGCATGTCGG 4920 

GGGGCACGTA GAAGTTACCC TCTTCTTCGG ACTCGATGTC CACGACGTCA AATTCGTGGG 4980 

CGGTCAGCGA GACGACCTCC CCGCCGTCGG TGATGATGAC GTTGTGTCGG CAGCAGCAGG 5040

GCCGCGCCCC GGAGAACGCG AGGCCCATAA CTTGGCGAGC GTATCGTCGA AGGCCAGGCG 5100 

GCTGTTTCGC CGGATGTCCC GGTATATCCC CGGCTCGACG CGGACGGGGG TGATGATCAG 5160 

GGCGATCGGA ACGGCCTGGT CCGGGAGGAT CGATGCCTTG GCGGGTCCGG GGGCCCCGCC 5220 
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ACGCCCGGCG GGCGCTCCGC GGCCGTCCTC 
CGCGCGGTGC CTGCCGAGGA ACGTCACCAG 
GCTGTCGAGG GACGTTTCCC TGCACCAAGA 
CGAAGATGGG CTCGCGGCGA ACCAGCTCCC 
5 GACGCTCGAG GTCGGGGACG CCAAACAGAA 
CCACCAGCGC CCGATCCGGG GCGGAGCATC 
GCCAGTCCCG GTCTTGGGTG ACGAGCGCCT 
AGTAGCGCAC GCCGGGGTTG GGGATGGACC 
GCCGCGCCAT CAGGTCCTCG TACGCGGAGG 

10 ACGCGTACTT GGCTCGGCAC TTAACCTCGT 
CCAGGTAGCC GTGAGGGTCC CTGGGGCACA 
CCGTGTGGCC GTCCATGAGG ACCCCGCACG 
GTTGGTGAAA GACGAAGCGC CCGGCGTCGG 
CCACGCAGTA GCGAAACAGC AGGTTTCGGG 

15 CCGCCGACGA CTGGGCGTCC AGCCGCAGGC 
AGCACGGACC CTGCGCGCCC CACCGCAGCG 
GGGCCCAGAG CTGGCAGTCG GCCTGGTTTT 
GGCGGGGGGC GACGGCTTCG GCGGCGGACG 
GCCGGCCGAG CCCGCGGTCC ACCATGCCGG 

20 GATAGTCCAG GCGAGCCCAC AGGGGCCCGA 
GGCCGCGCAG GTGGCGCTCG AACGTTTCCG 
TCGCCGACGC CGACCACATC GGGTCGGGGT 
TGGCGTGTGC GCCCCCCGGC GAGAGGGGAA 
CAGAGAGGGC CGGGGACGCG GGCCGGGCCT 

25 CACGTGGGGG GCTCTGGGGC CAATGGGAAC 
GGGCGGGGCG GGGCCCAAAG ACGGTCGCCA 
GGGGACTATC GGGGTCGCGG GCGGGGTCCG 
CCGCCATTTT TACGAGCAGC CGAAGAGCTC 
TGGCGCGCGG CCGGGTTGGC GTGACAGAGG 

30 TCGGGCGGCA GCGACACCGA CGACAGGACG 
TGCTCCGTGA ACGCGCGCCG AATCTTGGGA 
ACGTCATACG CCAGGCCGTG GGTGTTGGTC 
AGAACGCAGC GATAGGCGAG GAGGGCCACG 
TACTGGTAGC CCGGGACGCG GGTCACGGGG 

35 ACCAGCAGCT CCAGCAGCGT CTGCCCCAGG 
TGCTTCAGGG GGCGGTTGTT AAACTCGGCC 
TCCGGCGGCT GGTTGTACCC GTGCCCCACC 
GCGGGCATCC CAAACCCCCG GGGGGACTCG 
CGGGATATCG TGGAGTTGGA GTTCAGGGTC 

40 CGGAGCGACA CCGCGTCCGA TCGCAGCATC 
TGGCTGATCC CGCACCTGGT GTTCAGGAAC 
TGGTGGAGGG CCGTCGCGAC GGAGGGGGTG 
TACTTGCCGA GGTCCATGTC GTACGCGGGG 
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CAGGCGGAAC 


GTCACGCCCT 


CCTCCGCGCC 


5280 


GTGCGGTTGC 

Vrf X \JV.\JU X X VJ 


AGGGGGCAGT 


CGGGAAAGTG 


5340 


TCTGTTTGAA 

X V-* X VJ XXX VJ/lXl 


GTTCGGGTGG 


CGGGGGTTGG 


5400 


CGGAGCTCCA 


GGCCACGGGA 


GAGATGGTGP 


5460 


GCACCTCCGA 


GACAACGCCG 


CTATTTAACT 


SS90 


sJV^W X X X X ± ± v_ 


GCCGGCGGCG 


CGGGAATCGA 


JJOU 


CCTCCGGGCC 


CGGGACGCGC 


CCGGGCGCGA 


^640 


GGATGAACGC 

\JVJX» X XJaUiX^vJV^ 


CCGGAACGCC 


TCCGGCGATC 


S7oo 


CCGCGGGGGC 


GCCGGGGTCC 


GCGGGGTCGA 


^760 


AGAAGGCCAG 


GGGGGTCTGG 


GGGGCGGGGG 




CGAGGATGTC 

V*. W*X Vp*VJ*x X X v^ 


CAGGGACGCC 


CCCACCATGC 


SftfiO 

JOOU 


CGTGCACGTT 


CTCCTCGGCG AGGTCCCCGG 




CGTCGTCGTT 


GACGCCCGCG 


TCCGCGCGGC 


6ooo 


CCGTCGGCTC 


GTTCACCCGC 


CCGAACATCA 


6060 


TGGCGTTGTG 

X VJVJ V* \J X X VJ X VJ 


GGTGAGCCAC 


TGGGACGAGA 


61 90 


TGGAGGCGGT 


CGTCAGGCCC 


CGCCGAAGCA 


61 AO 
OJ.OU 


GCGTCGCPGP 


CTCGTAAAAT 


CCCATAAGCG 


6*> AC\ 


GGGGGGCGCG 


GCGCGTCAGG 


CGCCAGAGGT 


D jUU 


CCGCCTCCAG 

V* V-* VJ \^ V^ X V— V» ^J.VJ 


CGACACGACG 


AGGGAGCACA 


67 6o 


TGGCCAGAGG 


GGAGCGGACG 


CCGCGCAGCA 


6490 


CCAAGATATG 


GGGGGGCAGT 


GCGTTGGGGA 


64fl0 


CCGGGGGACC 


GGGGCTGCAG 


TCCGGGTCGA 


"6R40 


TGTCGGGGGT 


TGGCGGGCCG 


GATGAGGCCT 


6600 


TTTCGCCCGG 


GGCCCCGCCG 


TCGGGTTGCC 


6660 

O D DU 


CCGGGGCCCC 

\-* *w VJ VJ VJ VJ V— v_ v_ 


CGGTGAAGTG 


GGGCGGGGTG 


6790 


GATCTAGGCT 


GTTGGGTCGG 


GGCCGCTTCG 


67P0 


CGGGGCGCTT 


GGCGCCGGGT 


GTTGCGGCGG 


6ft AO 


GAGGGCGGAA 


GGGATCCTCA CGACAGAGAG 


6qoo 


CGGGAGACCA 


GCACCAGCAG 


CGGCCTCAGC 


6Q60 


GCCTTGTGCG 


TGCGCTGGTA 


ATTTATACAC 


7090 


TTGCGAAGGT 


GGCGCCGGAT GCCCTCCGGC 


70fto 

/ U O w 


TCGGCCGAGT 


TGACAAAGAG 


GGCGGGGTGC 


71 40 


GCAAAGTCCG 


GCGAGAGCTG 


GTTGTTAAAG 




ACGCCCAGGC 


TCGGGGCCAC 


GTACACGCTA 


79 60 


GCGTAGAGAT 


CGACCGCCAG 


CCCGACGTCG 


7790 




TGAGGTACTT 


TACCAAGAGC 


7380 


AGAGTGTGAA 


AGTTGGCCGT 


GGTCAGGGCG 


7440 


AGGTCCGGCT 


CCTGGAGGCA AAACTGGCCC 


7500 


ACCAGGCTAA 


AGTCGGCCAG 


GACGGCCCGC 


7560 


ACGAGGACGT 


TGGCGCACTT 


GATGTCCAGG 


7620 


ACCACGGCGC 


GCGCCAGGTC 


TGTGAAGCAG 


7680 


GTCGCGCGCA 


GGGACGCCAG 


CTGGCCGATG 


7740 


AACACGATCT 


GGCGCTGCTG ■ 


CAGCGAGAAC 


.7800 
CCGAGCGGGG TGATAAAGCC GCGGATGTCG TGGGTGCGGC CGCCGCAAAA AGCGCACTCC 7860

CCCACGAGCA GGGTCGCGAC GAGCTCCACG GCAAACCACT CCTTTTCCCG GATGGTCTTC 7920

ACGGCCAGCT TGTGTTCGCA AATCAACTGC ACCTCGCCGT ACCCCCCCGA GCCCCCGAAG 7980

CTGCGGGCCC CGGGGATCTC CAGGGTCGTG TAGCGGAGGG CGGGGTTGAC GGCGAATACG 8040

GGGATGCATA GCTTGTGGAT GCGCGCGAGG GACAGGATGT GCGAGGGGGG CGACGGGGGC 8100

GAGGTCATGG CCGTCTCGGA CCTGCGCAGG GGCGGGCGCC TCAGCTTGGC CGCAGGGCCG 8160

GGGGCCTCGG GGGACGAGCG GCGACGAGAC GAGCGGCTCA CTCGCCATCG GGACAGTCCC 8220

GCGCGAAGCC GCTCCCGGAA GCTGGATCGG CGGCGGGACC CGGGGCGGGC TCCGGAGACG 8280

GCGCCGTCTC GGGGGGAGGG GCCGCTTGGG CGTCCGGACG CCCGGCGGCT GAGGGAGTGT 8340

ATGTAGGACG CGAGCCAGGC CTTGAAGGAG CGTCGGTGTG CACCTTGGGG GCTGATGTCA 8400

GCTGCCACAT GACTAGCAGG TCGCTGTCGC CCGGACTCAT CCATCCGTCC GCCAGGTCGC 8460

CGTCCCCCCA CAGAGACGCG TTCGCCGCGG CCTCTTCGAG CTGCTCCTCC TGGTCCGCAA 8520

GACGATCGTC CGCCGCGTCC AGGCGCTCGC TAAGCGCGGG ATCGAGGTAC CGTCGGTGTG 8580

CGGTTAGAAA ATCACGTCGC GCCGCTTGCT CTTCCACGCG AATTTTAACA CAGGTCGCTC 8640

GCTGTCGCAT CATCTCTAAG CGCGCGCGGG ACTTTAGCCG CGCCTCCAAT TCCAAGTGGG 8700

CCGCCTTGGC GGCCATAAAG GCGCCAACAA ACCTAGGATC TTGTGTACTC ACGCCCTCCC 8760

GGTGTAGCTG CAGGGTCTGG TCCCTGTACA CCTCGGCCCG GAGGTGCGTC TCGGCCAAAC 8820

GTCGGCGCAG GGCCGCGTGG CTGGCGTCTC GGCTCATCTC GCCGCCCCCG CGCGCGCCCG 8880

ACGTCGGACT CCTTCGCCCC GACCCCCCTG ACCTCAGCCG CCCCCGCCTC GCCCGCGATG 8940

TTTGGCCAGC AGCTGGCGTC CGACGTGCAG CAGTACCTGG AGCGCCTGGA GAAACAGAGG 9000

CAACAGAAGG TGGGCGTCGA CGAGGCGTCG GCGGGCCTGA CGCTCGGCGG CGATGCGCTG 9060

CGCGTCCCTT TTTTGGATTT TGCCACCGCG ACGCCCAAGC GCCACCAGAC CGTGGTCCCG 9120

GGCGTCGGGA CGCTCCACGA CTGCTGCGAG CACTCGCCGC TCTTCTCGGC CGTCGCGCGG 9180

CGGTTGCTGT TTAATAGCCT GGTGCCGGCG CAACTCAGGG GGCGTGACTT TGGGGGCGAC 9240

CACACGGCCA AGCTGGAGTT CCTGGCCCCC GAGCTGGTGC GGGCGGTGGC GCGCCTGCGG 9300

TTTCGGGAGT GCGCGCCGGA GGACGCCGTG CCCCAACGCA ACGCCTACTA CAGCGTCCTG 9360

AACACGTTTC AGGCCCTGCA CCGCTCCGAA GCCTTTCGGC AGTTGGTTCA CTTCGTGCGG 9420

GACTTCGCCC AGTTGTTGAA AACCTCGTTC CGGGCCTCTA GTCTCGCGGA GAATACGGGC 9480

CCCCCGAAGA AACGGGCCAA GGTGGACGTG GCCACCCACG GGCAGACGTA CGGCACCTTG 9540

GAGCTCTTCC AGAAAATGAT ACTAATGCAC GCGACCTACT TTCTGGCCGC CGTGCTGCTC 9600

GGGGACCACG CGGAGCAGGT CAACACGTTC CTGCGGCTCG TGTTCGAGAT CCCCCTGTTT 9660

AGCGACACGG CCGTGCGGCA CTTCCGCCAG CGCGCCACCG TGTTTCTAGT CCCCAGGCGC 9720

CACGGAAAGA CCTGGTTTTT GGTGCCCCTC ATCGCGCTGT CGCTCGCGTC CTTCCGGGGG 9780

ATCAAGATAG GCTACACGGC CCACATCCGC AAGGCGACCG AGCCCGTGTT TGATGAGATC 9840

GACGCCTGCC TGCGGGGCTG GTTTGGCTCG TCCCGGGTGG ACCACGTCAA GGGGGAAACC 9900

ATCTCGTTCT CGTTCCCGGA CGGCTCGCGC AGCACGATCG TGTTTGCCTC CAGCCACAAC 9960

ACGAACGTAA GTACGCCTTC CTCCCGCGGT GCCTGTTTCC CCGGTGCCGC CCTCCCCGAG 10020

ATCGACCGAC AGACAAACAC AGCCAGACGC GAGTGTGGGA CGACACGCCC GCAGCCCCCC 10080

CCGCCATGGC GGGGGGAAGC CTTACTGTTT ATTTGTAATC GGACGATGAG GCTCTGGCCA 10140

CGGCCCGCGC GACCGCGGGG CAGCTCGTTG CAAACAGGCG GCTGGTATAC GATGACAGAA 10200

CGCAGAGGCG CCACCCGGCG CTGGTCGGGC GGATGACGCT TTCCGCGCCG TCCCGGCCCA 10260

CGACGACCTC GTGCAGGTGG GCCGTGATGC GCGGGCGGCG GGTCGCCTGC CGCAGGATAA 10320

CCGCGTCCAC GGGGTGCCCG AAGAGGAGCT GACACAGGCT CGCGTCCCCC CGGACGGCCA 10380
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GGGTGCGCTG GGCCATATTG GACCACATGC 
CGGCGGGGGC GCGCCACAGC GCGTTGGCGG 
CGCCTCCTCC CGGGGGGTCG GTAATCCTGG 
TGCCCGGGGG ACAGAGCGAC CCCAGGTCAT 
5 CGGGGAGGTG CCACCAGGCC CCCGGACCCA 
GTTCCGTGGG TACCAGGTAG GCGCCGTCGA 
GTTCGGCGGC GGGGTCGGGG GTTTCCTCCG 
CTAGGGTGCA CAGCAGCGGG GTCCGGGGGT 
AGTAGCGGCG CTCGCGGTTA AAGAAAAAAA 

10 GCGCCTTGGG CCGCGTCAGG TACAGGAAAA 
GGTCCGGAAG GGCCACCTGG CACAGCGGCT 
TAAGCCGCTC GTCCCCCCGA ACGACGCGCC 
CAAGGTCGGC TTCGGGCCCC GGGTCGGGGG 
CCGGCGGGGC CGCGGCTCCC GGGGGGCCTG 

15 GTGCCATGTT GGTGGTGGGG AAGGGACCGG 
TGTATGACTT GGGGGGGGGG GTGGGTGACC 
GGCCGGGTCC CGTACCCGCC CCGCGACAGA 
CTGCGAGGCT GAGGTACGCC GCGGTGTTAA 
CTAGCCCGCA GAGGCGGCGA TTGAACCCAA 

20 GGTATTGCTC GCAAACCCTG TGCGGGGCAG 
TACTCTGCTC GCGTCCATTG ACGTCACCGT 
GGCGCAGCGT GTCTCCGCTG GTGCTGTAGT 
CGAAGCGGGC GGGGATGTCG TCGCTGAGAG 
CCTGGCCGCC CAGATGCGCC AGCACGGCCA 

25 GGGCGGTCCC CTCGAGGGCG CGCATCAGGT 
CCTCCCCTAG CCACTCGCTC TGGTGGGGGC 
GCGGGCCGCC TTGGAGCGCG GCCCGGATAG 
GGATGCGCGC GACGAAGGCC ACCTCGGCCG 
GGTGGCGCAG GGGTCCCTCG AGCGCGGGAA 

30 GGGACAGCTG GTGGGGGCGC ACGACGCGCT 
CCAGATAGGA GGACAACAGC GGGGGGGGGG 
TGTAGGAGAC GACGACGAAG CGCTGCTTGG 
TGGAGCGACG GCTCCACAGC CAGTCGGGCC 
GGTCCAGCCA CTCGACCAGC GATCGCGGCT 

35 CGTTGAGGAC GTCATCGCCC GCGGCCCGGG 
CGGGCGGTTC CAGGCCCGCC AGCACCGCCT 
TGACGGCNTG GTGGACCAGG GCGCCCTGGC 
GCTTGGGGCG CGCGCGGACC GGGTGGCGGC 
GCTTGTCGAT ACCGTGGACT CTGAAGTAGC 
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APGGGGCGAP 


GPAGGGACAG 


GC C TPPP PP A 


10440 


AATC GATGTG 


GGPCGTCGGG 


GCGPAPPPPP 


toson 


AT AGC AGC P A 


TPPTAAATGG 


CGGGCTTGGC 


IvJDU 


PATPCATGGP 


PPAGCAGTAT 




i neon 


GGGCAPAGPA 


CGGGCCCGGA 


X X \.V9UvWUv 


1UDOU 


GCTCGTGGG C 


PAPPGGPTPP 


TPPPPr* ar , r , T 




V3UUUOvn\3(J v„ 


APPTTPPAGP 


TPP ccc a a r^p 




PPPTTAPPPT 


PPPPAGGTGP 




1 nft fin 


TfJRrAAAAAA 
x uVJ^.An/uvin 


PPTPTTPPA A 


PPP APPPPP A 


t noon 


x \» x LVj^Annn 


A APPPP APPP 




11)5 OU 




ppTpapppap 


HaAAAAA lux 


iin/in 
1XU4U 


apappa a aap 


apaPTTPppp 


a T'f^TT^r^/^TT' A 


in nn 

1XXUU 


CGCGCGnGTc 






111 fin 

JL JL JL OU 


ppptpppptp 


PPP APPPP AP 


1 Ot-U-CoU X 


1 1 o o n 


AG APPP A CCA 


A A AGC AG A GG 


pppp Acr'ccr' 


1 1 9fln 


GGTGGAAPAA 


AAAPAPPPPT 


p Appppapa a 


1 1 "\AC\ 


appppaptpp 


PA PPPP A PPP 


pppapppppT 1 


114 nn 


TGGTAAAPPP 


AAAPPPTPPP 


pnaa apappa 


1 1 a fin 


GGCAGAPPTA 


PPPPTAPPTP 


TPTPPPppa a 


11 D Z U 


TGGAGGGPPT 


PPPPTPPATP 


a APPpapaTT 1 


i i con 


PAATPAPPAP 


TPPPATTPPA 


rrnwpT a a 


1 1 fi An 

IX 0 *4U 


APTPAAAPPP 


PTAPTPPPPP 


Trpr&pTppp 

X U oVj/\VJ 1 UoVj 


1 1 Tnn 


ppapp apppp. 


V^C o V_ \j C V- VjL, 




1 1 *7 £n 

XX / DU 


ppppptappp 


pptptpa a ar 




1 1 Qon 
xxozu 


tptppappap 


papppppa ap 


pp.pppppT»pa 


1 1 Pfln 
1 x oou 


P A A A G r PGG r P a 




tpp a a p a tpp 

1 VjVj AAuA 1 


1 1 q An 

1 Xl7 *4U 


APTPPPPPAP 


PPPPPPPAPA 


PAPPPPaTPT 


1 9nnn 

1ZUUU 


PP ATPTP AAA 


PPPPTPP APP 


AL Vj oi- o 


1 on fin 

1Z U D u 


APPPAPPPAP 


PA GGGCGGTC 


TPPPPPPPPP 


191 on 
xzxzu 


CGGCGGCACA 


PPPPTPPPTP 


SPPPPPPTPP 


1 *> i fin 

X Z X OU 


TGCGTCGCCC 




APPP A ATTTT 
nuLunni XXX 


1 99x10 
X z z *± u 


TC C C GT AGTG 


ATGGCGCAGG 


ACCACGGAGA 


12300 


GGTCGCCGCC 


GGCCAGAGGT 


TCCCACCCGC 


12360 


TGGCGGTCCC 


CGGCACGAGG 


GTGAGCACGT 


12420 


GGCCCCCCGG 


GGTGGCAAAG 


CGCCCCCCGC 


12480 


CCGCGTCCGA 


CGCGCCCAGG 


GCTCCCCCGC 


12540 


GGAGCCCCGA 


GGNGACGCCG 


GAGGCCGCGT 


12600 


GGGTGACGTC 


CTGCACGGCC 


CGCTGATCAA 


12660 


CCGTAAGGAA 






12701 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 857 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:



Met Ala Glu Thr Met Asn Val Ala Thr Cys Thr His Gln Thr His His
1 5 10 15

Ala Ala Arg Ala Pro Gly Ala Thr Ser Ala Pro Gly Ala Ala Ser Gly
20 25 30

Asp Pro Leu Gly Ala Arg Arg Pro Ile Gly Asp Asp Glu Cys Glu Gln
35 40 45

Tyr Thr Ser Ser Val Ser Leu Ala Arg Met Leu Tyr Gly Gly Asp Leu
50 55 60

Ala Glu Trp Val Pro Arg Val His Pro Lys Thr Thr Ile Glu Arg Gln
65 70 75 80

Gln His Gly Pro Val Thr Phe Pro Asp Ala Ser Ala Pro Thr Ala Arg
85 90 95

Cys Val Thr Val Val Arg Ala Pro Met Gly Ser Gly Lys Thr Thr Ala
100 105 110

Leu Ile Arg Trp Leu Gly Glu Ala Ile His Ser Pro Asp Thr Ser Val
115 120 125

Leu Val Val Ser Cys Arg Arg Ser Phe Thr Gln Thr Leu Ala Thr Arg
130 135 140

Phe Ala Glu Ser Gly Leu Pro Asp Phe Val Thr Tyr Phe Ser Ser Thr
145 150 155 160

Asn Tyr Ile Met Asn Asp Arg Pro Phe His Arg Leu Ile Val Gln Val
165 170 175

Glu Ser Leu His Arg Val Gly Pro Asn Leu Leu Asn Asn Tyr Asp Val
180 185 190

Leu Val Leu Asp Glu Val Met Ser Thr Leu Gly Gln Lys Pro Thr Met
195 200 205

Gln Gln Leu Gly Arg Val Asp Ala Leu Met Leu Arg Leu Leu Arg Thr
210 215 220

Cys Pro Arg Ile Ile Ala Met Asp Ala Thr Ala Asn Ala Gln Leu Val
225 230 235 240

Asp Phe Leu Cys Ser Leu Arg Gly Glu Lys Asn Val His Val Val Ile
245 250 255

Gly Glu Tyr Ala Met Pro Gly Phe Ser Ala Arg Arg Cys Leu Phe Leu
260 265 270

Pro Arg Leu Gly Pro Glu Val Leu Gln Ala Ala Leu Arg Pro Pro Gly
275 280 285

Pro Ala Gly Gly Ala Pro Pro Pro Asp Ala Pro Pro Asp Ala Thr Phe
290 295 300

Phe Gly Glu Val Glu Ala Arg Leu Ala Gly Gly Asp Asn Val Cys Ile
305 310 315 320

Phe Leu Ser Thr Val Ser Phe Ala Glu Val Val Ala Arg Phe Cys Arg
325 330 335

Gln Phe Thr Asp Arg Val Leu Leu Leu His Ser Leu Thr Pro Pro Gly
340 345 350

Glu Val Thr Thr Trp Gly Arg Tyr Arg Val Val Ile Tyr Thr Thr Val
355 360 365

Val Thr Val Gly Leu Ser Phe Asp Pro Pro His Phe Asp Ser Met Phe
370 375 380

Ala Tyr Val Lys Pro Met Asn Tyr Gly Pro Asp Met Val Ser Val Tyr
385 390 395 400

Gln Ser Leu Gly Arg Val Arg Thr Leu Arg Lys Gly Glu Leu Leu Ile
405 410 415

Tyr Met Asp Gly Ser Gly Ala Arg Ser Glu Pro Val Phe Thr Pro Met
420 425 430

Leu Leu Asn His Val Val Ser Ala Ser Gly Gln Trp Pro Ala Gln Phe
435 440 445

Ser Gln Val Thr Asn Leu Leu Cys Arg Arg Phe Lys Gly Arg Cys Asp
450 455 460

Ala Ser His Ala Asp Ala Ala Gln Arg Ser Arg Ile Tyr Ser Lys Phe
465 470 475 480

Arg Tyr Lys His Tyr Phe Glu Arg Cys Thr Leu Ala Cys Leu Ala Asp
485 490 495

Ser Leu Asn Ile Leu His Met Leu Leu Thr Leu Asn Cys Met His Val
500 505 510

Arg Phe Trp Gly His Asp Ala Ala Leu Thr Pro Arg Asn Phe Cys Leu
515 520 525

Phe Leu Arg Gly Ile His Phe Asp Ala Leu Arg Ala Gln Arg Asp Leu
530 535 540

Arg Glu Leu Arg Cys Gln Asp Pro Asp Thr Ser Leu Ser Ala Gln Ala
545 550 555 560

Ala Glu Thr Glu Glu Val Gly Leu Phe Val Glu Lys Tyr Leu Arg Pro
565 570 575

Asp Val Ala Pro Ala Glu Val Val Met Arg Gln Ser Leu Val Gly Arg
580 585 590

Thr Arg Phe Ile Tyr Leu Val Leu Leu Glu Ala Cys Leu Arg Val Pro
595 600 605

Met Ala Ala His Ser Ser Ala Ile Phe Arg Arg Leu Tyr Asp His Tyr
610 615 620

Ala Thr Gly Val Ile Pro Thr Ile Asn Ala Ala Gly Glu Leu Glu Leu
625 630 635 640

Val His Pro Thr Leu Asn Val Ala Pro Val Trp Glu Leu Phe Arg Leu
645 650 655

Cys Ser Thr Met Ala Ala Cys Leu Gln Trp Asp Ser Met Ala Gly Gly
660 665 670

Ser Gly Arg Thr Phe Ser Pro Glu Asp Val Leu Glu Leu Leu Asn Pro
675 680 685

His Tyr Asp Arg Tyr Met Gln Leu Val Phe Glu Leu Gly His Cys Asn
690 695 700

Val Thr Asp Gly Pro Leu Leu Ser Glu Asp Ala Val Lys Arg Val Ala
705 710 715 720

Asp Ala Leu Ser Gly Cys Pro Pro Arg Gly Ser Val Ser Glu Thr Glu
725 730 735

His Ala Leu Ser Leu Phe Lys Ile Ile Trp Gly Glu Leu Phe Gly Val
740 745 750

Gln Leu Ala Lys Ser Thr Gln Thr Phe Pro Gly Ala Gly Arg Val Lys
755 760 765

Asn Leu Thr Lys Arg Ala Ile Val Glu Leu Leu Asp Ala His Arg Ile
770 775 780

Asp His Ser Ala Cys Arg Thr Gln Leu Tyr Ala Leu Leu Met Ala His
785 790 795 800

Lys Arg Glu Phe Ala Gly Ala Arg Phe Lys Leu Arg Ala Pro Ala Trp
805 810 815

Gly Arg Cys Leu Arg Thr His Ala Ser Gly Ala Gln Pro Asn Thr Asp
820 825 830

Ile Ile Ala Ala Leu Ser Glu Leu Pro Thr Glu Ala Trp Pro Met Met
835 840 845

Gln Gly Ala Val Asn Phe Ser Thr Leu
850 855



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 470 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Val Tyr Cys Ser His Ser Ser Ser Pro Met Gly Arg Arg Ala Pro Arg
1 5 10 15

Gly Ser Pro Glu Ala Ala Pro Gly Ala Asp Val Ala Pro Gly Ala Arg
20 25 30

Ala Ala Trp Trp Val Trp Cys Val Gln Val Ala Thr Phe Ile Val Ser
35 40 45

Ala Ile Cys Val Val Gly Leu Leu Val Leu Ala Ser Val Phe Arg Asp
50 55 60

Arg Phe Pro Cys Leu Tyr Ala Pro Ala Thr Ser Tyr Ala Glu Ala Asn
65 70 75 80

Ala Thr Val Glu Val Arg Gly Gly Val Ala Val Pro Leu Arg Leu Asp
85 90 95

Thr Gln Ser Leu Leu Ala Thr Tyr Ala Ile Thr Ser Thr Leu Leu Leu
100 105 110

Ala Ala Ala Val Tyr Ala Ala Val Gly Ala Val Thr Ser Arg Tyr Glu
115 120 125

Arg Ala Leu Asp Ala Ala Arg Arg Leu Ala Ala Ala Arg Met Ala Met
130 135 140

Pro His Ala Thr Leu Ile Ala Gly Asn Val Cys Ala Trp Leu Leu Gln
145 150 155 160

Ile Thr Val Leu Leu Leu Ala His Arg Ile Ser Gln Leu Ala His Leu
165 170 175

Ile Tyr Val Leu His Phe Ala Cys Leu Val Tyr Leu Ala Ala His Phe
180 185 190

Cys Thr Arg Gly Val Leu Ser Gly Thr Tyr Leu Arg Gln Val His Gly
195 200 205

Leu Ile Asp Pro Ala Pro Thr His His Arg Ile Val Gly Pro Val Arg
210 215 220

Ala Val Met Thr Asn Ala Leu Leu Leu Gly Thr Leu Leu Cys Thr Ala
225 230 235 240

Ala Ala Ala Val Ser Leu Asn Thr Ile Ala Ala Leu Asn Phe Asn Phe
245 250 255

Ser Ala Pro Ser Met Leu Ile Cys Leu Thr Thr Leu Phe Ala Leu Leu
260 265 270

Val Val Ser Leu Leu Leu Val Val Glu Gly Val Leu Cys His Tyr Val
275 280 285

Arg Val Leu Val Gly Pro His Leu Gly Ala Ile Ala Ala Thr Gly Ile
290 295 300

Val Gly Leu Ala Cys Glu His Tyr His Thr Gly Gly Tyr Tyr Val Val
305 310 315 320

Glu Gln Gln Trp Pro Gly Ala Gln Thr Gly Val Arg Val Val Ala Ala
325 330 335

Phe Ala Met Ala Val Leu Arg Cys Thr Arg Ala Tyr Leu Tyr His Arg
340 345 350

Arg His His Thr Lys Phe Phe Val Arg Met Arg Asp Thr Arg His Arg
355 360 365

Ala His Ser Ala Leu Arg Arg Val Arg Ser Ser Met Arg Gly Ser Arg
370 375 380

Arg Gly Gly Pro Pro Gly Asp Pro Gly Tyr Ala Glu Thr Pro Tyr Ala
385 390 395 400

Ser Val Ser His His Ala Glu Ile Asp Arg Tyr Gly Asp Ser Asp Gly
405 410 415

Asp Pro Ile Tyr Asp Glu Val Ala Pro Asp His Glu Ala Glu Leu Tyr
420 425 430

Ala Arg Val Gln Arg Pro Gly Pro Val Pro Asp Ala Glu Pro Ile Tyr
435 440 445

Asp Thr Val Glu Gly Tyr Ala Pro Arg Ser Ala Gly Glu Pro Val Tyr
450 455 460

Ser Thr Val Arg Arg Trp
465 470

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 687 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:



Met Ala Ala Ala Ala Thr Pro Gly Ala Lys Arg Pro Ala Asp Pro Ala
1 5 10 15

Arg Asp Pro Asp Ser Pro Pro Lys Arg Pro Arg Pro Asn Ser Leu Asp
20 25 30

Leu Ala Thr Val Phe Gly Pro Arg Pro Ala Pro Pro Arg Pro Thr Ser
35 40 45

Pro Gly Ala Pro Gly Ser His Trp Pro Gln Ser Pro Pro Arg Gly Gln
50 55 60

Pro Asp Gly Gly Ala Pro Gly Glu Lys Ala Arg Pro Asp Ala Leu Ser
65 70 75 80

Glu Ala Ser Ser Gly Pro Pro Thr Pro Asp Ile Pro Leu Ser Pro Gly
85 90 95

Gly Ala His Ala Ile Asp Pro Asp Cys Ser Pro Gly Pro Pro Asp Pro
100 105 110

Asp Pro Met Trp Ser Ala Ser Ala Ile Pro Asn Ala Leu Pro Pro His
115 120 125

Ile Leu Ala Glu Thr Phe Glu Arg His Leu Arg Gly Leu Leu Arg Gly
130 135 140

Val Arg Ser Pro Leu Ala Ile Gly Pro Leu Trp Ala Arg Leu Asp Tyr
145 150 155 160

Leu Cys Ser Leu Val Val Ser Leu Glu Ala Ala Gly Met Val Asp Arg
165 170 175

Gly Leu Gly Arg His Leu Trp Arg Leu Thr Arg Arg Ala Pro Pro Ser
180 185 190

Ala Ala Glu Ala Val Ala Pro Arg Pro Leu Met Gly Phe Tyr Glu Ala
195 200 205

Ala Thr Gln Asn Gln Ala Asp Cys Gln Leu Trp Ala Leu Leu Arg Arg
210 215 220

Gly Leu Thr Thr Ala Ser Thr Leu Arg Trp Gly Ala Gln Gly Pro Cys
225 230 235 240

Phe Ser Ser Gln Trp Leu Thr His Asn Ala Ser Leu Arg Leu Asp Ala
245 250 255

Gln Ser Ser Ala Val Met Phe Gly Arg Val Asn Glu Pro Thr Ala Arg
260 265 270

Asn Leu Leu Phe Arg Tyr Cys Val Gly Arg Ala Asp Ala Gly Val Asn
275 280 285

Asp Asp Ala Asp Ala Gly Arg Phe Val Phe His Gln Pro Gly Asp Leu
290 295 300

Ala Glu Glu Asn Val His Ala Cys Gly Val Leu Met Asp Gly His Thr
305 310 315 320

Gly Met Val Gly Ala Ser Leu Asp Ile Leu Val Cys Pro Arg Asp Pro
325 330 335

His Gly Tyr Leu Ala Pro Ala Pro Gln Thr Pro Leu Ala Phe Tyr Glu
340 345 350

Val Lys Cys Arg Ala Lys Tyr Ala Phe Asp Pro Ala Asp Pro Gly Ala
355 360 365

Pro Ala Ala Ser Ala Tyr Glu Asp Leu Met Ala Arg Arg Ser Pro Glu
370 375 380

Ala Phe Arg Ala Phe Ile Arg Ser Ile Pro Asn Pro Gly Val Arg Tyr
385 390 395 400

Phe Ala Pro Gly Arg Val Pro Gly Pro Glu Glu Ala Leu Val Thr Gln
405 410 415

Asp Arg Asp Trp Leu Asp Ser Arg Ala Ala Gly Glu Lys Arg Arg Cys
420 425 430

Ser Ala Pro Asp Arg Ala Leu Val Glu Leu Asn Ser Gly Val Val Ser
435 440 445

Glu Val Leu Leu Phe Gly Val Pro Asp Leu Glu Arg Arg Thr Ile Ser
450 455 460

Pro Val Ala Trp Ser Ser Gly Glu Leu Val Arg Arg Glu Pro Ile Phe
465 470 475 480

Ala Asn Pro Arg His Pro Asn Phe Lys Gln Ile Leu Val Gln Gly Asn
485 490 495

Val Pro Arg Gln Pro Leu Ser Arg Leu Pro Pro Ala Thr Ala Pro Gly
500 505 510

Asp Val Pro Arg Gln Ala Pro Arg Gly Arg Gly Gly Gly Arg Asp Val
515 520 525

Pro Pro Gly Gly Arg Pro Arg Ser Ala Arg Arg Ala Trp Arg Gly Pro
530 535 540

Arg Thr Arg Gln Gly Ile Asp Pro Pro Gly Pro Gly Arg Ser Asp Arg
545 550 555 560

Pro Asp His His Pro Arg Pro Arg Arg Ala Gly Asp Ile Pro Gly His
565 570 575

Pro Ala Lys Gln Pro Pro Gly Leu Arg Arg Tyr Ala Arg Gln Val Met
580 585 590

Gly Leu Ala Phe Ser Gly Ala Arg Pro Cys Cys Cys Arg His Asn Val
595 600 605

Ile Ile Thr Asp Gly Gly Glu Val Val Ser Leu Thr Ala His Glu Phe
610 615 620

Asp Val Val Asp Ile Glu Ser Glu Glu Glu Gly Asn Phe Tyr Val Pro
625 630 635 640

Pro Asp Met Arg Val Val Thr Arg Ala Pro Gly Pro Gln Tyr Arg Arg
645 650 655

Ala Ser Asp Pro Pro Ser Arg His Thr Arg Arg Arg Asp Pro Asp Val
660 665 670

Ala Arg Pro Pro Ala Thr Leu Thr Pro Pro Leu Ser Asp Ser Glu
675 680 685



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Val Thr Phe Leu Gly Arg His Arg Ala Gly Ala Glu Glu Gly Val Thr
1 5 10 15

Phe Arg Leu Glu Asp Gly Arg Gly Ala Pro Ala Gly Arg Gly Gly Ala
20 25 30

Pro Gly Pro Ala Lys Ala Ser Ile Leu Pro Asp Gln Ala Val Pro Ile
35 40 45

Ala Leu Ile Ile Thr Pro Val Arg Val Glu Pro Gly Ile Tyr Arg Asp
50 55 60

Ile Arg Arg Asn Ser Arg Leu Ala Phe Asp Asp Thr Leu Ala Lys Leu
65 70 75 80

Trp Ala Ser Arg Ser Pro Gly Arg Gly Pro Ala Ala Ala Asp Thr Thr
85 90 95

Ser Ser Ser Pro Thr Ala Gly Arg Ser Ser Arg
100 105

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 525 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Val Gly Gly Arg Arg Pro Gly Gly Arg Met Asp Glu Ser Gly Arg Gln
1 5 10 15

Arg Pro Ala Ser His Val Ala Ala Asp Ile Ser Pro Gln Gly Ala His
20 25 30

Arg Arg Ser Phe Lys Ala Trp Leu Ala Ser Tyr Ile His Ser Leu Ser
35 40 45

Arg Arg Ala Ser Gly Arg Pro Ser Gly Pro Ser Pro Arg Asp Gly Ala
50 55 60

Val Ser Gly Ala Arg Pro Gly Ser Arg Arg Arg Ser Ser Phe Arg Glu
65 70 75 80

Arg Leu Arg Ala Gly Leu Ser Arg Trp Arg Val Ser Arg Ser Ser Arg
85 90 95

Arg Arg Ser Ser Pro Glu Ala Pro Gly Pro Ala Ala Lys Leu Arg Arg
100 105 110

Pro Pro Leu Arg Arg Ser Glu Thr Ala Met Thr Ser Pro Pro Ser Pro
115 120 125

Pro Ser His Ile Leu Ser Leu Ala Arg Ile His Lys Leu Cys Ile Pro
130 135 140

Val Phe Ala Val Asn Pro Ala Leu Arg Tyr Thr Thr Leu Glu Ile Pro
145 150 155 160

Gly Ala Arg Ser Phe Gly Gly Ser Gly Gly Tyr Gly Glu Val Gln Leu
165 170 175

Ile Cys Glu His Lys Leu Ala Val Lys Thr Ile Arg Glu Lys Glu Trp
180 185 190

Phe Ala Val Glu Leu Val Ala Thr Leu Leu Val Gly Glu Cys Ala Phe
195 200 205

Cys Gly Gly Arg Thr His Asp Ile Arg Gly Phe Ile Thr Pro Leu Gly
210 215 220

Phe Ser Leu Gln Gln Arg Gln Ile Val Phe Pro Ala Tyr Asp Met Asp
225 230 235 240

Leu Gly Lys Tyr Ile Gly Gln Leu Ala Ser Leu Arg Ala Thr Thr Pro
245 250 255

Ser Val Ala Thr Ala Leu His His Cys Phe Thr Asp Leu Ala Arg Ala
260 265 270

Val Val Phe Leu Asn Thr Arg Cys Gly Ile Ser His Leu Asp Ile Lys
275 280 285

Cys Ala Asn Val Leu Val Met Leu Arg Ser Asp Ala Val Ser Leu Arg
290 295 300

Arg Ala Val Leu Ala Asp Phe Ser Leu Val Thr Leu Asn Ser Asn Ser
305 310 315 320

Thr Ile Ser Arg Gly Gln Phe Cys Leu Gln Glu Pro Asp Leu Glu Ser
325 330 335

Pro Arg Gly Phe Gly Met Pro Ala Ala Leu Thr Thr Ala Asn Phe His
340 345 350

Thr Leu Val Gly His Gly Tyr Asn Gln Pro Pro Glu Leu Leu Val Lys
355 360 365

Tyr Leu Asn Asn Glu Arg Ala Glu Phe Asn Asn Arg Pro Leu Lys His
370 375 380

Asp Val Gly Leu Ala Val Asp Leu Tyr Ala Leu Gly Gln Thr Leu Leu
385 390 395 400

Glu Leu Leu Val Ser Val Tyr Val Ala Pro Ser Leu Gly Val Pro Val
405 410 415

Thr Arg Val Pro Gly Tyr Gln Tyr Phe Asn Asn Gln Leu Ser Pro Asp
420 425 430

Phe Ala Val Leu Ala Tyr Arg Cys Val Leu His Pro Ala Leu Phe Val
435 440 445

Asn Ser Ala Glu Thr Asn Thr His Gly Leu Ala Tyr Asp Val Pro Glu
450 455 460

Gly Ile Arg Arg His Leu Arg Asn Pro Lys Ile Arg Arg Ala Phe Thr
465 470 475 480

Glu Gln Cys Ile Asn Tyr Gln Arg Thr His Lys Ala Val Leu Ser Ser
485 490 495

Val Ser Leu Pro Pro Glu Leu Arg Pro Leu Leu Val Leu Val Ser Arg
500 505 510

Leu Cys His Ala Asn Pro Ala Ala Arg His Ser Leu Ser
515 520 525

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 79 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Ala Val Ser Asp Leu Arg Arg Gly Gly Arg Leu Ser Leu Ala Ala
1 5 10 15

Gly Pro Gly Ala Ser Gly Asp Glu Arg Arg Arg Asp Glu Arg Leu Thr
20 25 30

Arg His Arg Asp Ser Pro Ala Arg Ser Arg Ser Arg Lys Leu Asp Arg
35 40 45

Arg Arg Asp Pro Gly Arg Ala Pro Glu Thr Ala Pro Ser Arg Gly Glu
50 55 60

Gly Pro Leu Gly Arg Pro Asp Ala Arg Arg Leu Arg Glu Cys Met
65 70 75

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 217 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Met Ser Arg Asp Ala Ser His Ala Ala Leu Arg Arg Arg Leu Ala Glu
1 5 10 15

Thr His Leu Arg Ala Glu Val Tyr Arg Asp Gln Thr Leu Gln Leu His
20 25 30

Arg Glu Gly Val Ser Thr Gln Asp Pro Arg Phe Val Gly Ala Phe Met
35 40 45

Ala Ala Lys Ala Ala His Leu Glu Leu Glu Ala Arg Leu Lys Ser Arg
50 55 60

Ala Arg Leu Glu Met Met Arg Gln Arg Ala Thr Cys Val Lys Ile Arg
65 70 75 80

Val Glu Glu Gln Ala Ala Arg Arg Asp Phe Leu Thr Ala His Arg Arg
85 90 95

Tyr Leu Asp Pro Ala Leu Ser Leu Asp Ala Ala Asp Asp Arg Leu Ala
100 105 110

Asp Gln Glu Glu Gln Leu Glu Glu Ala Ala Ala Asn Ala Ser Leu Trp
115 120 125

Gly Asp Gly Asp Leu Ala Asp Gly Trp Met Ser Pro Gly Asp Ser Asp
130 135 140

Leu Leu Val Met Trp Gln Leu Thr Ser Ala Pro Lys Val His Thr Asp
145 150 155 160

Ala Pro Ser Arg Pro Gly Ser Arg Pro Thr Tyr Thr Pro Ser Ala Ala
165 170 175

Gly Arg Pro Asp Ala Gln Ala Ala Pro Pro Pro Glu Thr Ala Pro Ser
180 185 190

Pro Glu Pro Ala Pro Gly Pro Ala Ala Asp Pro Ala Ser Gly Ser Gly
195 200 205

Phe Ala Arg Asp Cys Pro Asp Gly Glu
210 215






(2) INFORMATION FOR SEQ ID NO: 34: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 493 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



Val Tyr Ser Arg Pro Pro Gly Val Ala Ala Gly Ser Gly Pro Cys Thr
1 5 10 15

Pro Arg Pro Gly Gly Ala Ser Arg Pro Asn Val Gly Ala Gly Pro Arg
20 25 30

Gly Trp Arg Leu Gly Ser Ser Arg Arg Pro Arg Ala Arg Pro Thr Ser
35 40 45

Asp Ser Phe Ala Pro Thr Pro Leu Thr Ser Ala Ala Pro Asp Ala Met
50 55 60

Phe Gly Gln Gln Leu Ala Ser Asp Val Gln Gln Tyr Leu Glu Arg Leu
65 70 75 80

Glu Lys Gln Arg Gln Gln Lys Val Gly Val Asp Glu Ala Ser Ala Gly
85 90 95

Leu Thr Leu Gly Gly Asp Ala Leu Arg Val Pro Phe Leu Asp Phe Ala
100 105 110

Thr Ala Thr Pro Lys Arg His Gln Thr Val Val Pro Gly Val Gly Thr
115 120 125

Leu His Asp Cys Cys Glu His Ser Pro Leu Phe Ser Ala Val Ala Arg
130 135 140

Arg Leu Leu Phe Asn Ser Leu Val Pro Ala Gln Leu Arg Gly Arg Asp
145 150 155 160

Phe Gly Gly Asp His Thr Ala Lys Leu Glu Phe Leu Ala Pro Glu Leu
165 170 175

Val Arg Ala Val Ala Arg Leu Arg Phe Arg Glu Cys Ala Pro Glu Asp
180 185 190

Ala Val Pro Gln Arg Asn Ala Tyr Tyr Ser Val Leu Asn Thr Phe Gln
195 200 205

Ala Leu His Arg Ser Glu Ala Phe Arg Gln Leu Val His Phe Val Arg
210 215 220

Asp Phe Ala Gln Leu Leu Lys Thr Ser Phe Arg Ala Ser Ser Leu Ala
225 230 235 240

Glu Asn Thr Gly Pro Pro Lys Lys Arg Ala Lys Val Asp Val Ala Thr
245 250 255

His Gly Gln Thr Tyr Gly Thr Leu Glu Leu Phe Gln Lys Met Ile Leu
260 265 270

Met His Ala Thr Tyr Phe Leu Ala Ala Val Leu Leu Gly Asp His Ala
275 280 285

Glu Gln Val Asn Thr Phe Leu Arg Leu Val Phe Glu Ile Pro Leu Phe
290 295 300

Ser Asp Thr Ala Val Arg His Phe Arg Gln Arg Ala Thr Val Phe Leu
305 310 315 320

Val Pro Arg Arg His Gly Lys Thr Trp Phe Leu Val Pro Leu Ile Ala
325 330 335

Leu Ser Leu Ala Ser Phe Arg Gly Ile Lys Ile Gly Tyr Thr Ala His
340 345 350

Ile Arg Lys Ala Thr Glu Pro Val Phe Asp Glu Ile Asp Ala Cys Leu
355 360 365

Arg Gly Trp Phe Gly Ser Ser Arg Val Asp His Val Lys Gly Glu Thr 
370 375 380 

5 He Ser Phe Ser Phe Pro Asp Gly Ser Arg Ser Thr He Val Phe Ala 
385 390 395 400 

Ser Ser His Asn Thr Asn Val Ser Thr Pro Ser Ser Arg Gly Ala Cys 

405 410 415 

Phe Pro Gly Ala Ala Leu Pro Glu He Asp Arg Gin Thr Asn Thr Ala 
10 420 425 430 

Arg Arg Glu Cys Gly Thr Trp Gin Pro Pro Pro Pro Trp Arg Gly Glu 

435 440 445 

Ala Leu Leu Phe He Cys Asn Arg Thr Met Arg Leu Trp Pro Arg Pro 
450 455 460 

15 Ala Arg Pro Arg Gly Ser Ser Leu Gin Thr Gly Gly Trp Tyr Thr Met 
465 470 475 480 

Thr Glu Arg Arg Gly Ala Thr Arg Arg Trp Ser Gly Gly 
485 490 



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 399 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:



Val Phe Leu Phe His Arg Ser Pro Thr Pro Pro Pro Lys Ser Tyr Thr
1 5 10 15

Arg Trp Pro Leu Cys Phe Trp Cys Val Ser Gly Pro Phe Pro Thr Thr
20 25 30

Asn Met Ala Gln Arg Ala Val Trp Arg Pro Gln Gly Thr Pro Gly Pro
35 40 45

Pro Gly Ala Ala Ala Pro Pro Gly His Arg Gly Ala Pro Pro Asp Ala
50 55 60

Arg Ala Pro Asp Pro Gly Pro Glu Ala Asp Leu Val Ala Arg Ile Ala
65 70 75 80

Asn Ser Val Phe Val Trp Arg Val Val Arg Gly Asp Glu Arg Leu Lys
85 90 95

Ile Phe Arg Cys Leu Thr Val Leu Thr Glu Pro Leu Cys Gln Val Pro
100 105 110

Asp Pro Asp Pro Glu Arg Ala Leu Phe Cys Glu Ile Phe Leu Tyr Leu
115 120 125

Trp Lys Ala Leu Arg Leu Pro Ser Asn Thr Phe Phe Ala Ile Phe Phe
130 135 140

Phe Asn Arg Glu Arg Arg Tyr Cys Ala Thr Val His Leu Arg Ser Val
145 150 155 160

Thr His Pro Arg Thr Pro Leu Leu Cys Thr Leu Ala Phe Gly His Leu
165 170 175

Glu Ala Asp Pro Glu Glu Thr Pro Asp Pro Ala Ala Glu Gln Leu Ala
180 185 190

Asp Glu Pro Val Ala His Glu Leu Asp Gly Ala Tyr Leu Val Pro Thr
195 200 205

Glu Pro Pro Pro Asn Pro Gly Ala Cys Cys Ala Leu Gly Pro Gly Ala
210 215 220

Trp Trp His Leu Pro Gly Gly Arg Ile Tyr Cys Trp Ala Met Asp Asp
225 230 235 240

Asp Leu Gly Ser Leu Cys Pro Pro Gly Ser Arg Ala Arg His Leu Gly
245 250 255

Trp Leu Leu Ser Arg Ile Thr Asp Pro Pro Gly Gly Gly Gly Ala Cys
260 265 270

Ala Pro Thr Ala His Ile Asp Ser Ala Asn Ala Leu Trp Arg Ala Pro
275 280 285

Ala Val Ala Glu Ala Cys Pro Cys Val Ala Pro Cys Met Trp Ser Asn
290 295 300

Met Ala Gln Arg Thr Leu Ala Val Arg Gly Asp Ala Ser Leu Cys Gln
305 310 315 320

Leu Leu Phe Gly His Pro Val Asp Ala Val Ile Leu Arg Gln Ala Thr
325 330 335

Arg Arg Pro Arg Ile Thr Ala His Leu His Glu Val Val Val Gly Arg
340 345 350

Asp Gly Ala Glu Ser Val Ile Arg Pro Thr Ser Ala Gly Trp Arg Leu
355 360 365

Cys Val Leu Ser Ser Tyr Thr Ser Arg Leu Phe Ala Thr Ser Cys Pro
370 375 380

Ala Val Ala Arg Ala Val Ala Arg Ala Ser Ser Ser Asp Tyr Lys
385 390 395



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 452 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Phe Leu Thr Gly Tyr Phe Arg Val His Gly Ile Asp Lys Leu Asp Gln
1 5 10 15

Arg Ala Val Gln Asp Val Thr Arg Arg His Pro Val Arg Ala Arg Pro
20 25 30

Lys His Ala Ala Ser Gly Val Xaa Ser Gly Leu Arg Gln Gly Ala Leu
35 40 45

Val His Xaa Ala Val Ser Gly Gly Ala Leu Gly Ala Ser Asp Ala Glu
50 55 60

Ala Val Leu Ala Gly Leu Glu Pro Pro Gly Gly Gly Arg Phe Ala Thr
65 70 75 80

Pro Gly Gly Pro Arg Ala Ala Gly Asp Asp Val Leu Asn Asp Val Leu
85 90 95

Thr Leu Val Pro Gly Thr Ala Lys Pro Arg Ser Leu Val Glu Trp Leu
100 105 110

Asp Arg Gly Trp Glu Pro Leu Ala Gly Gly Asp Arg Pro Asp Trp Leu
115 120 125

Trp Ser Arg Arg Ser Ile Ser Val Val Leu Arg His His Tyr Gly Thr
130 135 140

Lys Gln Arg Phe Val Val Val Ser Tyr Lys Asn Ser Val Ala Trp Gly
145 150 155 160

Gly Arg Arg Trp Pro Leu Leu Ser Ser Tyr Leu Ala Thr Ala Leu Thr
165 170 175

Glu Ala Cys Ala Ala Glu Arg Val Val Arg Pro His Gln Leu Ser Pro
180 185 190

Ala Ala Gln Thr Ala Leu Leu Arg Arg Phe Pro Ala Leu Glu Gly Pro
195 200 205

Leu Arg His Pro Arg Pro Val Leu Gln Pro Phe Asp Ile Ala Ala Glu
210 215 220

Val Ala Phe Val Ala Arg Ile Gln Ile Ala Cys Leu Arg Ala Leu Gly
225 230 235 240

His Ser Ile Arg Ala Ala Leu Gln Gly Gly Pro Arg Ile Phe Gln Arg
245 250 255

Leu Arg Tyr Asp Phe Gly Pro His Gln Ser Glu Trp Leu Gly Glu Val
260 265 270

Thr Arg Arg Phe Pro Val Leu Leu Glu Asn Leu Met Arg Ala Leu Glu
275 280 285

Gly Thr Ala Pro Asp Ala Phe Phe His Thr Ala Tyr Ala Val Leu Ala
290 295 300

His Leu Gly Gly Gln Gly Gly Arg Gly Arg Arg Arg Arg Leu Val Pro
305 310 315 320

Leu Ser Asp Asp Ile Pro Ala Arg Phe Ala Asp Ser Asp Ala His Tyr
325 330 335

Ala Phe Asp Tyr Tyr Ser Thr Ser Gly Asp Thr Leu Arg Leu Thr Asn
340 345 350

Arg Pro Ile Ala Val Val Ile Asp Gly Asp Val Asn Gly Arg Glu Gln
355 360 365

Ser Lys Cys Arg Phe Met Glu Gly Ser Pro Ser Thr Ala Pro His Arg
370 375 380

Val Cys Glu Gln Tyr Leu Pro Gly Glu Ser Tyr Ala Tyr Leu Cys Leu
385 390 395 400

Gly Phe Asn Arg Arg Leu Cys Gly Leu Val Val Phe Pro Gly Gly Phe
405 410 415

Ala Phe Thr Ile Asn Thr Ala Ala Tyr Leu Ser Leu Ala Asp Pro Val
420 425 430

Ala Arg Ala Val Gly Leu Arg Phe Cys Arg Gly Ala Gly Thr Gly Pro
435 440 445

Gly Leu Val Arg
450

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26339 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



GGGAGCGAGG TGAGGGAATG AGAGAAGAAA GGAGGAGGAT AGTAAGTGGG GAGATGAAGA 60 

AGAAAATAGA CGCTGGGAAG GAGAATACGA CGAGGAGGGA GGAAGGAGAA GAGAGTCGGA 120 

AAGCATAAGA GGTGGAAAGG AGGTGTGAGT AATTGAGCGG AGATGAGAGG ATAGGATAAG 180 

GCGAGAGACG GAAGATAAGA AAGTGAAGGG AGTAAGGGTA AGGTGAGAGA GAGAAAGAGG 240 

AGGATTATGT GATGGTTAGG GGAGAGAGAG GAATATGTGG AGAAATTGTG AGGAAGGAAA 300

AGAGAGAGAA GAGTGTGGGT ATAAAGGAGA TATGGATGGA ATAGAGTAAA TTGAAGAAGG 360 

GAAGAGATGG TAAGAATGGA GTGAAGAGGA GGTAAAGAAT TTAGTAAAGG AGTGGTGATG 420 

ATGGATAAAA AAGTGGAAAT GGGGTAAGAA AGAAGAGAGA GGAGGAGGAA AAAAAAAAAA 480 
TAAAAAAAGG GCCTTGCCGC CGGCCCTGGC GTACGCGCTA TATAAGCCCA TGCGGTATTG 540

GATGAGTTCC CGCGCGCCCC GGAACTCCTC CACCGCCCAC GGGGCCAGGT CCGCGGCCGC 600 

CGCGTCAAAC TCCGCCAGCA GGCGCCCCAG GGCGTCAAAG TTCATCTCCC AGGGCACCCT 660 

GCGCACCACC TCATCCCGCA GCCGGGCGCA CAGGGCGGTG TGCTTGGTGA CGCGCGCGCC 720 

CAGCTCCTCC ACGGCCTCCG CGCGCTCGGC GCCCTTGGCG CCCAGGACGC CCTGGTACCT 780

GGCGGAAAGG CGCTCGTAGG CCGGCTGGGC CCGCAGCCCC GACACCGTGT TGGTGGTGTC 840 

CTGCAGGGCG CGCAGCTGCT CGTGCATGGC GCGGAACCCC TCGGGGGACT TCCAGGCGCC 900 

CCCCCGGACG CGGCCAAAGC GACCCCAGAC CTCGTCCCAC TCCGCCTCGG CCTCCTCCAG 960 

GGACCTCCGC AGGGCGTCGA CGCGGCGCCG AGTATCAAAG AGCGCCCCCA GGCGGCCGGC 1020 

GTGCCGCGCC AGGGGGCCGG GGCCGTCGCC GCGGGCGGCG CTTAGCGGGT GCGTCTCGAA 1080

GGTGCGCTGG GCGTGCTCTA GCCAGATAAC CGCGGGCACG TCGAGCTCGC GCGTTTTCTC 1140 

GGTCTGATCC AACAGAACCT CGACCTGGTC GGCGATCTCC GCCACCGAGC GCGCCTGGTC 1200 

GAGCGTCTTG GCCACGGTCG CCGGGACGGC GACCACCTTC AGCATGGTCT TGAGGTTGGC 1260 

CAGGCCCTCG GCCTCGATCT GGGCCCGGCG CTCGCGCGCG GCCAGCGCCT CCCGCAGGCC 1320 

CGCCATGACC CGCTCGGTGG CCTCCGCGCG CTGCTGTTTG GCGCGCACCA CTGCGTCCTT 1380

GGTCTCGGCC GTGTCCTGCC GGGTCACGAA GGCGACATAC TCGGCGTACG CCGTGTTCTT 1440 

CACGGGGCTC TGGTCCACGC GCTCCAACGC CGCCGCGCAC GCGACCAGCG CGTCCTCGCT 1500 

GGGACACGGC AGGGTGACCC CGGTCCGGAC CAGCTCCGCG GTGGCCTCCG GGTCATTCCG 1560 

GGCCGCGGAT ATCTGCTCCG CGGCGGCCGC CAGGTCCAGG GGCACGCCGC CGAGCGCCCG 1620 

GTGCACGTCG GCCCGGATGG CGTCCAGGCG ATCGCGGAGC TCCACGTAGT CGGCGTAGCC 1680

ATGTTGGAAG AACGGCACGT ACCGGCGCAG GCCGGGCACG CTCGTCATGT CGTCCGCCAG 1740 

GCGCCCCACG GCCTCGTGGT AGTCGATAAA CCCGTCGCCC GCCTGGGCCA TTTCCAGGAG 1800 

CCCCTCCGCG ATGCGCAGCA GCCGCGCCAG GGGCTCGGCG TCGACCCGAA ACATGTCGGC 1860 

GTAGGTTTCG GCGGCGGCGT GGAACGCCGC GCTCCAGCCG AGGCGGTGGA TGGCGGCGAG 1920 

CGGGGGGAGC ATGGGGTGGC GCTGGTTCTC GGGGGTGTAG GGGTTAAACG CGAAGGCCGT 1980

ATCCAGGGCG AGGGTGACCG CCTCGGCGTT GGCCGCGAGC GCCTGCTCGG CGCGCTTGCG 2040 

GAAGTCCCGG GGGTTGTAGC CGTGCGTGCC CGCCAGCGCC TGCAGGCGGC GCAGCTCGAC 2100 

CACGTCGAAC TCGGCGCGGT TCTCGACGCG GTCCAGCGCC GCCTCGACGC CGGCGGCCCA 2160

GCGCTCGCTG CTGCCCCGGG CGCGCTGGGC CGCCATCTTC CCCGTCAGGT CGGCGACGGC 2220 

GGCCTCAAGT TCCTCGGCGC GGCGTCGCGT GGCGCCGATG ACCTTGCCCA GCTCCTGCAG 2280

GGCGCGCCCG CTGGGGGAAT GGTCCCCGGC CGTCCCTTCG GCGTGCAGCA GGCCCCCGAA 2340 

CCCAGCCTCG TGCCCCGCGA GGCTTTCCCG AGCAGCGGTG GTCGCGCGGG CCGCGGCATC 2400 

GATGAGGGCG GCATGGTCCC CCTCCGGCTG GGCGCAGGCC CGGCGCGCAT GGAATACCAG 2460 

GTCGGCGGCC GCCGACCCCA GGGTGGTGAG CTTGTCGATG GCCCCCCGCG CCTCCAGGGC 2520 

CAGCCGAGTC GCCTTTACAT ACCCCGCGGC GCTATCGGCC AGCGCCGCGA GGAAGGACAG 2580

GGGCGAGGCC GGGTCGCGGG CGGCCGCGCC CAGGGCCGAC ACCGCGTCCG CCAGGGCGCC 2640 

ATGCGCCCGC ACGGCCGCGT CCACCGTCGC CGCGGGACTT GCCGTCGCGA CGGCGGCGCT 2700 

CCCGGCGTTG ATGGCGTTTG ACACGGCTTT GGCGATTGTG GGGGCGTGAT CGGAAAAGAA 2760 

CTGCACGAGG ACCGGCGTCT CGGGGGCGTC NGCGAACATG GTCTTCAGCA CCACCACTAA 2820 

GGCGGGATGC ATGCCGGCCA CAACCGTCTC GGTATCCGGG GTCTGGTGTT CCANGGCCTC 2880

CCGGTACTGC CCCATCACCC CCCACATGTC CGCCCGCAGC CCCGCCTTGA CTTCCGGGGG 2940 

GGGGCCCCCG GACGGCATCG GCCAGGTCGG TCCACCCCGC GGGGCAGGGA GGCCCGCAGG 3000 

GTCGCCAGCA CGGCCGGACA CGCCTTTAGC CCCACAAAGT CCGGGAGGGG CCGCAGGACC 3060 




WO 98/20016 



PCT/US97/20016 



CCTTGGAGTT TGTGCAGGAA CTTCTCCCGG GCGTCGTGGG CCACCTTGGC GCGCTCCCGC 3120

GCGTCGTTGA GCATCGCCTC CAGGGCGTGG GCGCGCTCCC GAAGCCGGGA GCGCGCCTCC 3180

GGAGCGAGCT CCGCCGTCAT CTTGGCCGCC TCCATGGCCC TCGCCTGCCG CAGCGCGTCT 3240

TCGGCCATGC GCGTGGCCTC GGGGGACAGC CCGCCCCCGT CGACGTACGG CGCGGGGCCG 3300

GTCGCCGGGA CGAAGGCCGC GTCGCTGTCC AGCTGCTGCG CGAGCGCCGC GTCGAGGGCG 3360

TCGAAGCGCT GCAGTTCGGC CAGCCCCGAG CTGCGCCGCG CCTGCTGGTC GTTGATGCCG 3420

TGGATGCTGC GCGCCAGCTC TTCCAGGGGC TTGCGTTCGA TGAGCCCCTG GGTCGCGGCG 3480

TCGGTCAGGA CCGAGAGCCA GGCCGCCAGG TCCTCGGGGG CATCTAGGGT CTGGCCCCGC 3540

TGGAGCAGGT CCCGCAGCAG GATGGCCTGG GGGCTGGTGG CGAGGGGGGG CGGGGGGGGG 3600

AGCGCGGCGC GCTGAGCGAC GTCCCGCGTG TGTTGGTCAA AGGCCGGTAG CGATTCCAGC 3660

AACTGGACCA TGGGCACGAC CGCGGCCGAG GCCACGTGAA ACCGACAGTC GTGGCTGTCG 3720

CTGGCCTGCA GGGCCTTCGC GCTGTATACG GCTCCCCGGT GGAAGTACTC CTTGATCGCG 3780

CTTTCGATCG CCCGGCGGGC CTGGATCCGC ACGTCCTCCA GCCGCGCATG GATGGCCTCG 3840

GGGCCCAGGG CGGGCGGGCA CGGGGCCCTG CCGCCGGCGC CCCGGGGGGG GGGGGCAACG 3900

GGCATCACGG TCAGGGGGCC CGGCGCGCTG CGAGACCGAG TAGACCCCGC GGGCGAGGGC 3960

GTATAAGGCC TCGCGCATCT CGCGGGCCTC CGCCTCGACC CGCATCTTTT TGCCCCGGGC 4020

AAAATGGGCC AGCGCCTGGA TCCGATGGAG AAGCGGCTCC GGGTGCGTCG GGGTGGCGGG 4080

GGCGAACAGG GTGTTCGGGT GGGCGCGCGA GCGCTCCAGG AGCCACTTTC CGAGGCGTGC 4140

GTACAGATTG GCCGGCGGGG CGGCGCGCAG CTGCAGATCC AGGTCCGCGA GGTCCCCGTA 4200

AAAGGCGTCC GTCTCCCGAA TAAAGTCCCT GGCGACCAGG ACCAGCTTAG CGAGGGCCAG 4260

GCGCCCGATC TGCGAATTTT TGTCCAGCAC GTGCTGGATG AGGGGCCGGT GGGCGGCCAC 4320

GTCCGCCAGG CTCATGCGCG TGGACGCCAG GAAGTCCCCG ACGGCCGTTT TGCGGGGCGG 4380

CATGCGCAGG GTGAAGTCCA GCAGGGCCGC GGCCGGGCCG GCCACCCCGG CCTGCGTATG 4440

CGTGCGGGCC CCGTTCTCGA TCAAAAAGGC GAGGACGCGC TCAAAGAAGA AGATGACGCA 4500

GAGCTCCAAC AGCCCCGGGT GCGCCGGGTA CGGCGACCGC AGGGCGTTGA TGGTGAGCTG 4560

CGAACACGCG GCCACCTCGC GGGCCAGGGC GGCATCGCGC GCCGCGAGCC GGACCGCCGT 4620

GGCGGCCACA TTGGGGTGGA CCTCGAACAG CTGCGCCAGG TCGGCGCCGG GGGGCTCCGG 4680

GGGGCGGCGG GCCCCCAGCG TCTCGAGCAC GGACGGCGAC GACGGGCTCG CGGGCCCGTC 4740

ATCGCCGCCT CCCTGCCCGG ACTGCGGGGG GGTATCCGGT GCGGGAGGGA CCGTGGCGGC 4800

TATGGGCGTC GGGGAGGAGG CGGGGACCTC GGCGGCGACG GGGGCCTTCT TCTTGGGCGC 4860

GGACTTCTTC TTGGCCTTGG CGGGCGGGGC CTTGGGGGCG GGCCTCTCGC CCGAGGTCAG 4920

ATCCTCCACG CTGGACGGTG GGGTCCAGGT GGGCCGGCGG CGCTTGGGCA AGCCGGTAGA 4980

ATAGCGCGCC CGGTGGCAAC CCACCGGCAC TGCCCCCACC TCCAGGACCC GCAGGTCCTC 5040

GGCTTCTTCG GCCGCGTCCC CGGCGGGTGT CTGCGGGGGC GGGGCGGCGT GCGGTGGACC 5100

CGAGGCCGCG GCGTCCGGGG CCGAGGGCTT TGCGGGCGGG GTCCCCTCCA GGGCTGCTGC 5160

CCACACATCA TCGGGGGGGC GGTTTGGGTG CCCCGCCTGC GGTGTGTTGG

GGCCCCCCGG GGGGCCTCGG GGGGCCGGTC GGCCCGAGGG GTCTGGACGT GGGTGGGCGC 5280

GGGGAGCGCG GGGACGACCG GGCCCGAGCC TTCTCCGTCC CCCCTGGGGA CCACACCGAC 5340

AAAGAGCGCC CCTAGCCCCC CGATCTCGCC CCGCAGGGGG TGGGTGATGG CCACGCGCCG 5400

CTCGACGAAC GGTTCGTCCT GCAGGTAAGT CTCGCTGGCC CCGTAGAGGT GCAGGGCCGC 5460

GGCGGTCAGG TCCGCCGGCG CCACGGCCCC CGGGCCGGAG GGCACAAAAA ACACCATGGC 5520

GCCCGCCCAC CGCACCTTGG GGCGGTCGTG GGCGTAATAC GTCAGGTACG GGTACACGTC 5580

GCCCGCCCGC ACCTTGGCGA TAAACGCGGG CGTTCCCGCG GGCAGGCCGT GCGGGTCAAA 5640
CAGATAGGCC GTGTCGCCGT CCCGGTAGAG CCCCATGCCC AGGGGGCCGA TGGTCAGGAG 5700

CGTGTAGGAC AGCGGCCGCA TGGCCCAGGG GCCGGCGAAG AACGTGTGCG CGGGGCATTG 5760 

CGTCTCCAGC AGCCCCGCCG TGGGCTCCCC GAAGAAGCCC ACCTCGCCGT ACACCCGCGA 5820 

AAACACGCAA CGCAGGCCGC CGCGCGCCGC CGGGTACTCC AGGAAGTTGG GGAGCTCGAT 5880 

AATGGAACAC ATGCGCGGCG GCCCGGAGCC CGCGGCCGCG CGCGTCCACT CGCCCCCCTC 5940

CACCAGACAT CCCTCAATGG CCTCCGCGGA CAGCACGTCG CGGGGCCCCA CGTCGAAAAG 6000 

AAGACTGAGA AACGACAGGG ACGAGCGCAT GCACGATACC GACCCCCCCG GCTCCAGATC 6060 

GGTCGCGAAC TGGTTCCGAA CACCGGTGAC CACGATATCG CGATCCCCCT GGGCGCTTCA 6120 

TCGTGGGGTG AGGTAGCGCG GCCGGAATCA TGTGTGCCGC GCCCGCCACG AGCGGGGCCT 6180 

GTTTATGGGC CGGGCGTCCC GATGAGTACT GTTGTTTCCG CCGCCCGAAC CCCCCCCGCC 6240

CATCAACCGC CTGTTCGTCC CCCTAACCAC ACACCCGGTA TCGCGTGAGC TCGTCGTAAC 6300 

TGAACAGGAG CACGCGGGCG CAGGTCGCCC ACGGGCCCCA CGCCAGGCGC AGCGCCGCAA 6360

CCGTGTACGG GTCGTACACG CCTTGGGCGT CGCACGCGAC CGGCAGGGAG ACGAACAGCC 6420 

CGCCCGCGCT GGGGACGCGC GGCAGGAGGT CCGGGTGCGC CGGGATGACG GGGGCTAGGA 6480 

TCGCCCCCAC CGCATCCGCC GGCACGTAGG CGGCAAACGC CGAACGCCAC GGGGTGCAGT 6540

CGCCGGGCGC GTGGGGCCGG GTCTGGGTTT CGACCCGGAA GTTCGCGGCC GCCCCGCCGT 6600 

CGGGGCGGCC GCGCACGAGG GCGGACAGCG GGACCCCCGC CGCCGCCAGG CACTCGCTGG 6660 

AGATGATGAC GTGAATCAGC GAGGCGGGGC TGCTCGGGTC CCGGGTGAGA TCGTATTGGA 6720 

CCTCGTTGGC AAAGTGCGCG TTCATGGCCC GGCCGGCGGT GCGAGCCCTT CCCGGTGCCG 6780 

GAAGGGGCGT GGGTGGGGGG TGCGTGTGCG CGTCCTCGGG GCCCGCGGGC GCACGTGCGC 6840

TTATACGCTG TGTGTTTCGT CTGTCCCCAG GGAATCCGGG GCCAGGACTT TAACCTGCTT 6900 

TTCGTCGACG AGGCCAACTT TATTCGCCCG GATGCGGTCC AGACGATTAT GGGCTTTCTC 6960 

AATCAGGCCA ACTGCAAGAT CATCTTCGTC TCGTCGACCA ACACCGGGAA GGCCAGCACG 7020 

AGCTTTTTGT ACAACCTCCG CGGGGCCGCC GACGAGCTGC TCAACGTGGT CACCTATATA 7080 

TGCGACGACC ACATGCCGCG GGTGGTGACG CACACCAACG CCACGGCCTG TTCCTGCTAT 7140

ATCCTGAACA AACCCGTGTT TATCACGATG GACGGCGCGG TTCGCCGGAC GGCCGATCTG 7200 

TTTCTGCCCG ACTCCTTCAT GCAGGAGATC ATCGGGGGGC AGGCCCGCGA GACCGGCGAC 7260 

GACCGGCCCG TCCTAACAAA GTCGGCGGGG GAGCGGTTTC TGCTGTACCG CCCCTCCACC 7320 

ACCACCAACA GCGGCCTGAT GGCCCCCGAG CTGTACGTGT ACGTGGACCC GGCGTTCACG 7380 

GCCAACACGC GCGCCTCCGG CACCGGCATC GCGGTCGTCG GGAGGTACCG CGACGATTTC 7440

ATTATCTTCG CCCTGGAGCA CTTTTTCCTC CGCGCGCTCA CGGGATCGGC CCCCGCGGAC 7500 

ATCGCCCGCT GCGTCGTGCA CAGCCTCGCC CAGGTGCTGG CGCTGCACCC CGGGGCGTTT 7560 

CGCAGCGTTC GCGTGGCGGT CGAGGGCAAC AGCAGCCAGG ACTCGGCCGT GGCCATCGCC 7620 

ACACACGTGC ATACCGAGAT GCACCGCATC CTGGCCTCGG CGGGGGCCAA CGGCCCGGGG 7680 

CCCGAGCTCC TCTTCTATCA CTGCGAGCCG CCCGGCGGCG CGGTATTGTA CCCCTTCTTT 7740

CTGCTCAACA AACAGAAGAC GCCCGCCTTC GAATACTTTA TCAAAAAGTT CAACTCCGGG 7800 

GGCGTCATGG CGTCCCAGGA GCTCGTCTCC GTGACGGTGC GCCTGCAGAC CGACCCGGTC 7860 

GAGTATCTGT CCGAGCAGCT CAACAACCTC ATCGAAACCG TCTCTCCCAA CACCGACGTC 7920 

CGCATGTACT CCGGAAAACG CAACGGTGCC GCGGACGACC TCATGGTCGC GGTCATCATG 7980 

GCCATTTACC TGGCGGCCCC GACCGGGATC CCCCCGGCCT TTTTTCCGAT CACGCGCACG 8040

TCTTGAGTCT TTCTTGCCGT TTCTTTTGTT TCTCTTTCTT TCCCCCCTCT CTCCGCAATA 8100 

AACGCCTTCC CGGAACTGTG TTTTCCCCCC CTACAACAGT GTTGTCCGTT GGTTGGGTGG 8160 

TTGGGGTGCG GGGGTGGGCG GGGGAAGCAA GAAAACGGTC GGCGAACACA ACATCGGGAA 8220 
AACGGATTCC CGAACGTGCG TCTTCCCAGA TTCGACACAC ACCCCCCTTC TCCTTAAATA 8280

AACACAAACC ACACGCTCGT TGGTTGGTTA ATGCCGGCGC TTTATTTACG TCTTGTTTTT 8340 

TTGCGTTTCC TCCGCGGGTC CCTTCCCAAC ACGCCTGCCC CCGCCTCAGG GGTAGCGGAT 8400 

AACCGGGGCC ATGTCGCCGG ATTGCACAAC GGCGGCGCCG TCGAACGTAC ACACCCGAAC 8460 

CGCCGGGGCC AGGGCCAGGA TGTCCCCGAG TTGGCCCGCG TGCGCCAGCC AGGCGACCAG 8520

CGCCTCGTAA AGCGGCAGCC TGCGCTCGCC GTCCTGCATC AGCATGGGGG CTTCGGGGTG 8580 

GATGAGCTGG GCGGCTTCTC GCGTGACGCT CTGCATCTGC AGGAGCGCGT TCACGTATCC 8640 

GTCCTGGGCG CTCAGCGCGA GCAGCCGGGG GATGAGCGTG AGGATGAGGG TGGTTCCTTC 8700 

GGTTATGGAG TAGACCATGT TGAGGACGAG CGACCGCAGC TCGGTGTTTA CGGAGGCGAG 8760 

TTGCTGGACG TCGGCCACGA GCGAGAGACG GGCCCCGTTG TAATACAGCA CGTTGAGGTC 8820

GGGGAGCTCC CCGGGCGTCC GGGGGTCGGG GTTGAGGTCC CGGATGCCCC GGGCGACCAG 8880 

CCGCGCGACT ATCTCGCGGG CCAGGGGCGT TGGGAGCGGG ACCGGAAACC GCAGCGTGAG 8940 

GTCCAGCGAC TCCAGGCGCA CGTCCGTCGC CTGGCCCTCG AAGACGGGCG GGACGAGGCT 9000 

GACGGGATCC CCGTTGCAGA GGTCGACGGG GGAGGTGTTG CGGAGATTGA CGGTGCCGGC 9060 

GTGCGTGAGC CCCAGGTCCA CGGGGCAGGC GACGATTCGC GTGGGCAGCA CCCGCGTGAT 9120

TACCGCGGGG AAGCGCCTGC GGTACGCCAG CAACAACCCC AACGTGTCGG GACTAACTCC 9180 

TCCGGAGACG AACGATTCGT GCGCCACGTC CGCGAGCGCC AGCTGGCGGC GGATGGTCGG 9240 

CAGAAAGACC ACTCGACCCT CGCACCGCTG CAGCGCCGCG GCATCGGGGC GCGAGATACC 9300 

CGAGGGGATC GCGATGTCTG CTTCGAAACA ATCCGTGATC ATGGCGCCGG GCCGCGAGAC 9360 

ACCGGAACGC GGGGGTGCGG GAGGGCCGGA AAGCGCAACG CAACCGGGAC GATGATGAAA 9420

CAGAGATGGG GGGCACCGAC CGTGTGGGAG AGGGGGCGGG GCAGGGCTCA GCAGCACGCA 9480 

CGGGGAGGTC TGTCGTGCGC AGGAGCCCCA GGTGAGAATC AGTCCCCCGG AGCTCGGGTC 9540 

TGGGTTTTAT TGGGACCTGC CCTCGGAATC GCGGCTCCCA GTCCAAGCCC CCCCGGGGGG 9600

GCGGGGACAG GGGGTGTGTG TGGGTAAAAG CAACGTCGGA AAATCAAACC CAATGCCCCA 9660 

AACAGGAAAA AAAAAAAAGA CGGGCGGGTG GAGGGAAAGC TGGGGAAGAA GAAGCCAATT 9720

TTACAGAGAC AGGCCCTTTA GCGGGGAGGC GTCGTAGATG AGATACTGCG TAAAGTGGGT 9780 

CTCTCGCGCG TGGGCCTCCC CATCGCGGGC GCTGCGTAGC AGGGCGGGGT CGCTGGCGCA 9840 

GGTGATCGGG TAGGCTTCCT GAAACAGGCC GCACGGGTCT TCCACGAGCT CGCGGCACCC 9900 

CGGCGGGCGC TTAAACTGCA CGTCGCTGGC AGCGGTGGCC GTGGATACCG CCGATCCCGT 9960 

TTCCACGATA AGACGCTCCA GGCAGCGATG TTTGGCCGTG ATGTCGGCCG CGGTGAAGAA 10020

CTTGAAGCAG GGGCTGAGGA CGGGCGAGGC CCCGTTGAGG TGATAGGCCC CGTTGTACAG 10080 

CAGGTCCCCG TACGAGAACC GCTGCGACGC CCACGGGTTG GCCGTGGCCG CGAAGGGCCG 10140 

CGCCGGGTCG CTCTGGCCGT GGTCGTACAT GAGGGCTATG ACGTCCCCCT CCTTGTCCCC 10200 

CGCGTACACG CCGCCGGCCG CGCGTCCCCG CGGGTTGCAG GGCCGGCGAA AGTAGTTGAT 10260 

GTCCGTGGCC ACGGGGGTGG CGATGAACTC ACACACGGCA TCCTGCCCGT GGTCCATGCC 10320

GGCGCGCCGC GGCACCTGGG CGCAGCCAAA GACCGGGAGG GGCTGGGCCG GCCCCAGCCG 10380 

GTTTCCCGCC ACGACCGCGT TGCGCAGGTA CACGGCGGCC GCGTTGTTTA GCAGCGGGGG 10440 

GGCCCCGCGG CCGAGGTAAA AGTTTTGGGG GAGGTTGCCC ATGTCCGTAA CGGGGTTGCG 10500 

GACGGTGGCC GTGGCCGCGA CGGCGGTGTA GCCCACACCC AGGTCCACGT TTCCGCGCGG 10560 

CTGGGTGAGC GTGAAGCTGA CCCCCCCGCC CGTTTCGTGG CGGGCCACCT GGAGCTGGCC 10620

CAGAAAGTAC GCCTCCGACG CGCGCTCGGA AAACAGCACG TTTTCGGTCA CGAAGCGGTC 10680 

CTGCCGCACG ACGGTGAACC CGAACCCGGG GTGGAGGCCC GTCTTGAGCT GGTGATACAG 10740 

GGCCACGGGG CTCATCTTGA AGTACCCCGC CATGAGCGCG TAGGTCAGCG CGTTCTCCCC 10800 
CGCCGCGCTC TCGCGGGCGT GCTGCACCAC GGGCTGGCGG ATGGAGGAGA
CCCCAGGGCC GGGGGGACCA GGGGGACGTG GCGCGCCAGG TCGCGCAGGG 
GTTGGGCGCG TTGGCCACGT GGTCGGCGCC CGCAAACAGC GCGTGGACGG 
GAAGTATTCG CCATTTTGGA TGGTGTGGTC CAGGTGCTGG GGGGCCATGA 
GGCGTGCAGC GCCCCGTCAA AAATGCGCAT GTTGGCCGTC GACGCGGTGT
GTCGGGCGCC GCGGAGCACA GCAGCGCCGT CGTGCGCTCG GCCATGTTGT 
CTGCAGCGTG AGCATGGCGG GCCCGTCAAC AACAACGCGC CCGTTGTGGA 
GACCGTGTTG GCCACCAAAT TGGCGGGATG CAGCGGGTGG GCGGGGTCGG 
GCTCGGGCAC TCCTCGCCGG GGGCGATCTC CGGGACCACC ATGTTCTGCA 

CACGCGGTCG AAGCGGACCC CCGCGGTGCA GCAGCGCCCC CGCGAGAAGG
CACGTAATAG TAGATTTTGT GGTGGACGGT CCAGTCGGCC GGCCGGTGCG 
GGCGGCGTCG GCCGCGCGGG CCTGGGTGTT GTGCAGCAGC CGGCCGTCGT 
GTCGGCCGTC GCCACGTTGC ACGCCGCCGC GTAGACGGGC TCGTGTCCCC 
CCGGCAGTCT CGGTGGCGGT CCAGGGCCGC GTGTCGCATA AGGCCGTCGC 

GAGGGGCGGC AGCAGCGCCG GGTCGCGCAT CAGGTGATTC AGCTCGGCCT
GCCCAGCTCC GGGCCCGGCA GGGTAAAGTC GTCCACCAGC TGGGCCAGGG 
GGCCACCAGG TCCCGATACA CGGCCATGCA CTCCTCGGGG AGGTCGCCCC 
CACGATGTAC GAGACCAGCG AGTAGTCGTT CACGAACGCC GCGCATCGCG 
GTAGCTGGTG ATGCACTGAG TCACGAGCCG CGCCAGGGCG CAGAACACGT 

GTGAATCGCG GCTTGCAGCA GGTAAAACAC CGCCGGGTAG CTGCGGTCCT
GCGGACGGCG GCTATGGTAG CCGGCGCCAT GGCGTGGCGG CCAACGCCGA 
CCGGGCGTCA CGAAACGCCA CCGGACACAG CGCCAGGGGC AGGTTGCCGT 
CCAGGTGGCC TGGATCGCCC CCGGACCGGC CGGGGGGACT TCGCCGCCGG 
GTCGGCCACG CCCGCGAAGA AGTCGAACGC GGGGTGCAGC TCCAGAGCCA 

GTCGGGCTGC ATGAACTGCT CCGCGGTCAT CTGGCACTCG GCGACCCACC
GTGGGCGAGG CGCTGCCGCC AGGCGTTCAG AAAACGCTGC TGCATGTCCG 
GGCCGGGGCC GCGACGTACG CCCCGTACGG ATTCGCGGCC TCGACGGGGT 
GCCCCCGACG GCCGCGTCGA TGTTCATGAG CGAAGGATGA CACACGGTCC 
CTCCATGGAC AGCCGCAGAA CCTGGTGGTC CTTTCCCCAA AAAAACAGCT 

GAACGCGCGG GGCTCCGGGT GGCCGGGGGC GGGCACCAGG TCCCCGGCGT
GCGCTCCATG GCCGGGTTGA ACAGCCCCAG GGGCAGGACG AACGTCAGGT 
CACCAGGGGG TAGGGCACGT TGGTGGCGGC GTAGATGCGC TTCTCCAGGG 
GACCAGCCTG TCGCCTATGG CCACCAGATC CGCGCGCACG CGCGTTGTCT 
TTCGAGTTCA TCCAGCGTCT CCCGGTTCGC CTCGAGTTGC TCCTCCTGCA 

GTGGCGGCCC ACGTCGTCCA GGCTCCGCAC GGCCTTGCCC ATCACCAGCG
GTTGGCCCCG TTCAAGACCA TCTCGCCGTA GGTCACCGGC ACGTCGGCCT 
CACCTTCAGG AAGGACTGCA GGAGGCGCTG TTTGATGGCG GCGGTGGTGA 
GTCGACCGGC CGCCCGCGCG TGTCGGCGTG CGTCAGGCGG GGCACGGCCA 
CGTCGCCGTG GTCAGGTCCA CGAGCCAGGC CTCGATGGCC TCGCGGCGAT 

GCCCAGGAAG AAGCTCGTGT CGCAAAAGCT CCGCTTCAGC TCGGCGACCA
GGCGACCCTG GTCGCCAGGC GCCCGTTGTC GAGATATCGT TGCATGGGCA 
CAGGGGAGGC GCCTTCTCCA ACAGCACGTG CAGCATCTGG TCGGCCGTGC 
CGCCCCCAGG ACGGCCTGGA CGTTGCGCGC GAGCTGCTGG ATGGCGCGCA 
CAGGCTAATG CCCGTCCCGT CCAGGGCCTC CCCCGTGAGC AGGGCAATGG CCTCGGTGGC 13440

CAGGCTGAAG GCGGCGTTCA GGGCCCGGCG GTCGATGACC TTCGTCATGT AATTATGCAC 13500 

GGGCTGCTCG ACGGGGTGCG GGCCGTCGCG GGCGATGAGG GGCTGGTGGA CCTCGAACTG 13560 

CACACGCCCT TCGTTCATGT AAGCCAGCTC CGGGAACTTG GTGCACACGC ACGCCACGGA 13620

CAGGCCGAGC TCCAGAAAGC GCACGAGCGA CAGGGTGTTG CAGTAGGACC CCAGCAGGGC 13680

GTCAAACTCT ACGTCATACA GGCTGTTTTC GTCGGAGCGC ACGCGGGCGA AAAAATCAAA 13740 

GAGTCTGCGG TGGGACGCCA CCTCGATCGT ACTCAGGATG GAGCCGGTGG GCACCATGGC 13800 

CGCGGCGTAC CGGTAACCCG GGGGGTCGCG GGCAGGAGCG GCCATTGGGT TCCTTGGGGG 13860 

ATTCGCAGGC TCCATCAAGC CAAGCTCGGG AAGGCCAAGC CCCTCCCACA CAACGCCTCA 13920 

CCGCCGGCGG ACGCGACTAA CAACCCACGG GCCGCCAAAA ACCCCAAGGG GCAACCCGAC 13980

CAACAACAGG CGAGGGGAGG AAAGGCGTAA AGGGGGCGTT GGGAGGCAAA AAGAAAGAAA 14040 

ACACCCAGAC GTAGGCCCGA GGACCGGCCG GCGTCCTCTG TCCCCGAGCA CCCACTGTGC 14100 

CCAACAGGCA CGGGGGCGAG CTGCCCCTGC CTTATATACC CCCCCGCCAC ACCCCCGTTA 14160 

GAACGCGACG GGTGCCTTCA AGATGGCCCT GGTCCAAAAG CGTGCTAGAA AAAAGTTGGT 14220 

AAAGGCGGCA AAGCAGTCCG CCGCCGCCAC CCACATGGCG GCGCCGGCCG CGCAGGCGAT 14280

TCCCAGAGAA CGGGCGCGGA GGGGATCCGT GCGGGGCAGC AGCTGGCTGG CGGTGATCCA 14340 

ATGGAAAAGC CCGTCGGGAC TGAACGTCTC ATGGGCGGCC GCCACCAGGG CGCACAGGGC 14400 

CGCGCCGCCC ATGATCACGC ACAACCCCCA AAACACGGGT GGCGACAACG GCAGGCGATC 14460 

CCGTTTGATG TTCACGTACA GGAGGAGCGC CCGTGCCAGC CACGTGACAT AGTAGGCGAG 14520 

GACGGCGGCT ATAATACATG CCGGCGCCAC CGCCCGTCCG GTCCACCCGT AATACATGCC 14580

CGCGGCCACC AGCTCCAGCG GCTTGAGGAC CAGGAACGAC CAAGCAAACA TCACCACCCG 14640 

CTTGGAAAAG ACCGGCTGGG TGTGGGGCGG AAGACGCGAG TAGGCCGAAC TGACAAAAAA 14700 

ATCAGACGTG CCGTACGAGG ACAGCGAAAA CTGTTCATCG AGCGGCAGTT CGCCGTCCTC 14760 

CCCGCCACAC GCGGCCTCGT ATACCAGCTC GCGATCCAAC AAAGGAACAT CATCCCGCAT 14820 

TGTCATGGTC GGTGCGGGGA GCCGGCGAGG CAGCGAAACC GAAAGTAGTG CTGGCGGCGC 14880

GGGCCCGGGT CCGGACCCAA GCTTCAGGGA TGGGGGGCGG AGGCCAAAAT CAAACAAGCA 14940 

CCGCGCGGGT TCTACACACA ACCCCCACCC GGGTAGTATC CGCGGATGCG AGTGCCTGGC 15000 

GAAGTCACGT CCCAGCAGGA TATAAACCTC GGCCGTTGGG CCCGGAACCC CCGAAATTCA 15060 

CACCCACGCC CTGACGCCCA AATCATGGGT GGATGTGGTT CGCGAGCCGC ACATCCGTGC 15120 

GTCCGCCCTC CCCCGCGGGC TGATGACGTG GCGGTTAGTC AGTGGGAAGG CAGGGGGAAA 15180

GATGGGTTGG GGGAGGAAAC GAAAAAAACA CCCAGAGGGC CACGTCGGGA ATGCGCCCGG 15240 

AGTTGTCCTT AAAAGGCCGG CCGTGCGTGA CGGAAGCCGT CGTTTGCCCA AGCACCGACG 15300 

CCGCGATCCA CAGTGGGGGG AGTTCCTCCG TCCGGCCACA ACCCTACGCG CGGGCGGCAC 15360 

GCGCGAGAGC AACCCACGGG TCCCGTTCGC GCCACCGCCA GCCCTTGCTC CCACCACCCT 15420 

CCTCCCACCA CCCCACTATT CCCCCCCCCC CAAGTCCGCC CCGTGGCTCG CCGGCCATGG 15480

AGCTCACCTA TGCCACCACC CTGCACCACC GGGACGTTGT GTTTTACGTC ACGGCAGACA 15540 

GAAACCGCGC CTACTTTGTG TGCGGGGGGT CCGTTTATTC CGTAGGGCGG CCTCGGGATT 15600 

CTCAGCCGGG GGAAATTGCC AAGTTTGGCC TGGTGGTCCG GGGGACAGGC CCCAAAGACC 15660 

GCATGGTCGC CAACTAGGTA CGAAGCGAGC TCCGCCAGCG CGGCCTGCGG GAAGTGCGGC 15720 

CCGTGGGGGA GGACGAGGTG TTCCTGGACA GCGTGTGTCT GCTAAACCCG AACGTGAGCT 15780

CCGAGCGAGA CGTGATTAAT ACCAACGACG TTGAAGTGCT GGACGAATGC CTGGCCGAAT 15840 

ACTGCACCTC GCTGCAAACC AGCCCGGGGG TGCTGGTGAC CGGGGTGCGC GTGCGCGCGC 15900 

GAGACAGGGT CATCGAGCTA TTTGAGCACC CGGCGATCGT CAACATTTCC TCGCGCTTCG 15960 
CGTACACCCC CTCCCCCTAC GTATTCGCCC TGGCCCAGGC GCACCTCCCC
GCTCGCTGGA GCCCCTGGTG AGCGGCCTGT TTGACGGCAT TCCCGCCCCG 
TGGACGCCCG CGACCGGCGC ACGGATGTTG TGATCACGGG CACCCGCGCC 
TGGCCGGGAC CGGGGCCGGG GGCGCGGGGG CCAAGCGGGC CACCGTCAGC 
AAGTGAAGCA CATCGACCGT GTTGTGTCCC CGAGCGTCTC TTCCGCCCCC
CCCCCGACGC GAGTCTGCCG CCCCCGGGGC TCCAGGAGGC CGCCCCGCCG 
TCAGGGAGCT GTGGTGGGTG TTCTACGCCG GCGACCGGGC GCTGGAGGAG 
AGTCGGGATT GACGCGCGAG GAGGTCCGCG CCGTGCATGG GTTCCGGGAG 
AGCTGTTTGG GTCGGTGGGG GCTCCGCGGG CGTTTCTCGG GGCCGCGCTG 

CGACCCAAAA GCTCGCCGTC TACTACTATC TCATCCACCG GGAGCGGCGC
TCCCCGCGCT CGTGCGGCTC GTCGGTCGGT ACATCCAGCG CCACGGCCTG 
CGCCCGACGA ACCGACGTTG GCCGATGCCA TGAACGGGCT GTTCCGCGAC 
CCGGGACCGT GGCCGAGCAG CTCCTCATGT TCGACCTCCT CCCGCCCAAG 
TGGGGAGCGA CGCGCGGGCC GACAGCGCCG CCCTGCTGCG CTTTGTGGAC 

TGACCCCGGG GGGGTCCGTC TCGCCCGAGC ACGTCATGTA CCTCGGCGCG
TGTTGTACGC CGGCCACGGA CGCCTGGCCG CGGCCACGCA TACCGCGCGC 
TGACGTCCCT GGTCCTGACC GTGGGGGACG TCGACCGGAT GTCCGCGTTT 
CGGCGGGGGC GGCTGGCCGC ACGCGAACCG CCGGGTACCT GGACGCGCTG 
GCCTGGCTCG CGCCCAGCAC GGCCAGTCTG TGTGAGATAT CCCAATAAAG 

TTCTAACCCA CGGATGCCGT TGTATGCCTA TACGGGGGAC TATGGGGGGG
AAAGGAAACA GGAATGGAGA AGGGAAAGGA ACAGAGGCGG TAGCGGACGC 
CAATAACAAA CAGACCGCGG ACACGGAGGG AGTCGGTTGG GTTGGGCGTG 
CGTCCACACA CCCGTTTATT CGCGTCTCCA CAAAAATGGG ACGCACGTTC 
GAGGATGCCC GCCAGGGCCG CGGTGATCAT AACGACCCCC AGCGCGGACG 

CCCGGGGGCG ATGGTGGCGA TGGGCAGCGT GTCAAAGGCC AGCAGATGAA
GTTGGGGAAC AACAACAGGG CCACGGACGG CACGTCGCTG GAAAACACGT 
CGCCACCGGC CCCTGGGCCA GCTGCTGTTG GGTGGCATCC GTGTCCACCA 
CATGACCTCC CCGGCCGGGG TGTAGCGCAG AAACACGGCC CCCACGAGGC 
CCGGTTTTCG GTGCGCACCA GCCGCTTCGG CTCAATCTCC CGCGCGTGCC 

GGCGGTGAGA TAGGTGATAA ACAGCGGGCG GCGGACGTCA ACGCCCGTAA
GATCCCGCGG GGCAAGGGGG TGTGGGTGAC GACGTAGCTG GCGTTGTGGG 
GAGGATCCGG GGCTCCGCGT TGTGCGACGG GCCGCTACAC TGGTGGGTGG 
GAAGGCGCGG ATCAGGGCGT TGTAGTGCGC CCAGCGCGTG AGAACGGAGG 
GGTCTGTTGT GCCATGACGT CCGCCGGGAT GTCGGATCGG GTGGCCATGG 

CAGGATGAAC CCGCCCTCGG CGAGATCGAA GCGCAGGGAA GCTGCGCATG
GTCCGGGAGC CAGAAGAGGT TTTTCTGGTG GTCGGTCCTG GCTAGCGCGG 
GGCGTGGGTC GCCGCGGCGA CGTCGGACGT ACACAGGGCC GTGGTTATGA 
GCGGGCGCGT TCCCGCTGCT CGGCCGAGGG CGCGCCCGCC AGGAACGGCG 
GGCCGTGGCG TAAAACAGCG CTCGGCGGAC CATCGGGGCG GTTAGCGCGC 

AAACTCGGCG TACAGGGCGT CGATCAGGCG GGCCGCGCTC GGGGCCACCG
CGCGGGGCTG TCCAACACGA ACGCCAGCTG ATAGCCCAGC GCGTGCGCCG 
CTCTCGCTCG AGGATCGCGG CCACCAGATG CCCGAGGCGC GCCTCCAGCC 
CGCCGGGTCC AACACGGACA CGTTCAGGAA CACCGAGTCG GCCGCGCAGC 
CCGGGCGGCC AGGCCGGCCA GCACGCGCGA GTGGGCCAAA AAGCCCAGCA GGTCGGAGAG 18600

GCGAATCGCG TCGTGGGCGT GGGCCGCGTT GACGAACGCA AACCCCGACG AGGCGAGCAG 18660 

CCCCGCGAGG CGCCAGAACA GGGACGGACG CGCGTCCGTG CCGGAGCCCG GGTCCTCCCC 18720 

CAAAAACTCC GCATAGGCCC GCGACATATA CTGGGCGTAG TTCGTGCTCT CCTCGGGGTA 18780 

GCCGGCCACC CGCCGGAGGG CGTCCAGCGC CGAGCCGTTG TCGGCGGGCG TCGGGGCCCC 18840

CAGGACAAAG ACGCGATACC TGGGGCCGGC CGGAGGCCCG GGGAGCACCG CGGGGGCGTT 18900 

TTCGTCGGTC GGATTTCCGA CCCGAGCGAG GGTCTTGTCC GCAGGCACCA CTATGATCTC 18960 

GGCCGGAGGG CTGTCCCGCA TCGATATCAC AACCCCCATG AAGCCCTTCC CGTATCGCGC 19020 

GCGCACAAGC GCGGCGTCGC ACCCGAACGC CAGCCCGCCC GTCGTCCAAA CGCCCACGGG 19080 

CCACTTCAAG GCCGACGGGG AGAGGTACAC TTACCGACCC GGAGTCCGTA GCAGGCCCCT 19140

GGCGGCCAGC CAGGTCACGG ATGCGTTGTG CAGATGCGCG ATGCTCAGGT TCGTCGTCGG 19200 

ATGCCTCGGT GTCCCCGCGG GCGGCCCCGG GGGCGGCGCG TTGCGTCGGC CGTCCGGGTG 19260 

CCTCTCGGTC GCCCCGTCGT CTCCCCGCGG GAACGTAAGC CCCTCGCGGT CCGGCGCGGC 19320 

CGCGAATGTT ACCCAGGCCC GGGACCGCAA CAGCGCGGAG GCGCCGGGGT TGTGCGACAG 19380 

TCCCTTGAGC TGGGTCACCT CGGCGGGGGG ACGGGACGTG GGCCCCGCCT CGGGGAGCTC 19440

GGGCAGGCTC GCGTTCCGAG GCCGGCCGAG CAGATAGGTC TTTGGGATGT AAAGCAGCTG 19500 

CCCGGGGTCC CGAGGAAACT CGGCCGTGGT GACCAACACA AAACAAAAGC GCTCGGCGTA 19560 

CCACCGAAGC ATGGGCACGG ATGCCGTAGT CAGGTTGAGT TCGCCCGGGG GCGCCAAGCG 19620 

TCCGCGCTGG GGGTCGCTGG CGTCGGGGGT GTTGGGCAAC CACAGACGCC CGGTGTTTGT 19680 

GTCGCGCCAG TACGTGCGGG CCAACCCCAG ACCGTGCAAA AACCACGGGT CGATTTGCTC 19740

CGTCCAGTAC GTGTCATGGC CCCCGGCAAC GCCCACCAGG ACCCCCATCA CCACCCACAG 19800 

ACCGGGGCCC ATGGTCGTCC GTCCCGGCTG CCAGTCCGCA GATGGGGGGG TGTCCGTACC 19860 

CACGGCCCAA AGAGGCTCCG CACCTCGGAG GCTATCGGAG GCCCTTTGTT GCCGTAAGCG 19920 

CGGGCCAAAG GATGGGGTGG GGTGAGGGTA AAAGCACAAA GGGAGTACCA GACCGAAAAC 19980 

AAGGACGGAT CGGCCCGCTC CGTTTTTCGG TGGGGTGCTG ATACGGTGCC AGCCCTGGCC 20040

CCGAACCCCC GCGCTTATGG ACACACCACA CGACAACAAT GCCTTTTATT CTGTTCTTTT 20100 

ATTGCCGTCA TCGCCGGGAG GCCTTCCGTT CGGGCTTCCG TGTTTGAACT AAACTCCCCC 20160 

CACCTCGCGG GCAAACGTGC GCGCCAGGTC GCGTATCTCG GCGATGGACC CGGCGGTTGT 20220 

GACGCGGGTT GGGATCATCC CGGCGGTGAG GCGCAACAGG GCGTCTCGAC ACCCGACGGG 20280 

CGACTGATCG TAATCCAGGA CAAATAGATG CATCGGAAGG AGGCGGTCGG CCAAGACGTC 20340

CAAGACCCAG GCAAAAATGT GGTACAAGTC CCCGTTGGGG GCCAGCAGCT CGGGAACGCG 20400 

GAACAGGGCA AACAGCGTGT CCTCGATGCG GGGCAGAGAC CCCGCGCCGT CCTCGGGGTC 20460 

GGGGCGCGGG GTCGCCGCGG CGACCCCCGT CAGCCGGCCC CAGTCCTCCC GCCACCTCCC 20520 

GCCGCGCTGC AGGTACCGCA CCGTGTTGGC GAGTAGATCG TAGACACGGC GAATGGCGGA 20580 

CAGCATGGCC AGGTCAAGCC GCTCGCCCGG GCGTTGGCGT CTGGCCAGGC GGTCGGCGTG 20640

TTCGGCCTCC GGAAGGACAC CCAGGACCAG GTTCGTGCCG GGCGCGGTCG GGGGCATGAG 20700 

GGCCACGAAC GCCAACACGG CCTGGGGGGT CATGCTTCCC ATGAGGTACC GCGCGGCCGG 20760 

GTAGCACAGC AGGGAGGCGA TAGGGTGCCG GTCGAAAACA AGGGTGAGGG CCGGGGGCGG 20820 

GGCTTGCGGG CCCACAGCCT CCCCCCCGAT ATGAGGAGCC AAAACGGCGT CCGTCGCCGC 20880 

ATAAGGCGTG CTCATTGTTA TCTGGGCGCT GGTCATTACC ACCGCCGCCT CCCCGGCCGA 20940

TATCTCGCCG CGGTCCAGAC GGTGCTGCGT GTTGTAGATG TTCGTCAGGG TCTCGGAGGC 21000 

CCCCAGCACC TGCCAGTAAG TCATCGGCTC GGGGACGTAG ACGATATTGT CGCGCGGCCC 21060 

CAGGGCCTCC ATCAGCTGCG CGGAGGTGGT GGTCTTCCCC ACCCCGTGGG GTCCGTCTAT 21120 


ATAAACCCGC 
ATGGCTAGGA 
GAACGCAGGC 
GCCCGAGAGG
CATCGCTCGG
GGAGGCGCTG 



AGCAGCGTGG 
CGGGACGCCG 
GCGTGCTGTT 
TGCGGGAGTT 
TGCAGGGTCG 
GGCCGCGCCG 
GTGTGTGCGT 
GCAAACGCGA 
AGTCGCTCGC 
CGCAGCGTAC 
ACATCACCGC 
CGCAGACCCG 
CCCCGACACG 
CCTCCCCAGG 
CGCTTTTTTG 
CTGTGTGTTG 
TCTGGCCAAG 
CGCCGGCGAC 
CGCCTCTCCC 
CGGAGCGCCG 
GCGCTGGACG 
CCCGAGTTCC 
GCGGAGCGGG 
GCCGCCCTCC 
CAGCAGGTGC 
GAAGAGGCGG 
GCGCCGTCGC 
GACCCGCCGC 
GGCCGCGGGG 
GAACGCACCA 
ATGTCCAAAA 
GTGGGCCAGC 
CGAACCACCC 
CTGGCCCGCC 
CGCCCGCAGT 
GGCCGCTACG 
GGGGTGCTAC 
GACGACGTGG 
CACAACCTCT 
GCCCTGGCCG 
AACCGCCTGC 
GGGGCGTCCG 
TGCGTTAACT 



GCAGCTCCGG ATCCCCGCGG 
CGCGGCCGTC GGTAGGCCCG 
GGCCGGCGTG AGAAGCCATA 
CAACGCCACC AGGATTTGTG 
CTCGCTGTTC GAGGCCACGC 
CCCAGACTGC ATCTGCGTGT 
CATCCTGGAG CTAAAGACAT 
ACAGCGGACC ACGGGCATGA 
GCCTCCGGGG GACAAGGTCG 
GCTGCGCGTC AGCCGCGTGA 
GGCCGTGCGG ATGCTCCAAA 
GCGGTCGCGG CGCCGGGTTG 
TGACCCGGAA GGCACGGCGG 
GGTTGTAGGC GTCGCTGCGG 
CGTGCCGGTG GCCGCCAAGA 
TTTTTTTTCC TCGTTTTGTT 
CATCCTCACC TGCTTAAGCG 
ACCCACCCGA CAACAGCCCC 
TTTTTTCCCC CCCTCAAAAA 
TCGTCGCCCG CCCGCCGCCC 
TTTGGGAACA CAGGCGCTTC 
CCCGGGACTT CTGGATGTTG 
CGGCAGTGAT GCAGGCCCAG 
AGGCCGCCGA GCTGCCCGTC 
ATCACATCGC CGACGCCCTG 
ATGCCGCGCG GGACGCCGAG 
CCACCGCGGG CCCCGCCGCC 
TACGATACGA TACCAACCTC 
CCGCGGGTTC GTCGGGAGTC 
TCGCGGACTT CCCCCTGACC 
CCTTCATGAC CGCGCTGGTC 
GCCACTATTC CGCCTTCGAG 
ACGAGTCCTC CCCCGATCGC 
TGCCGCGCTA CCTGGCGCGT 
ACCGCTACCG CGACGACAAG 
AGCACGGGGC CCTGGCCACC 
CGGCGGCCCC GGGCGACGTT 
CCCACCGCGA CGACGTCAAC 
TCCTGTGGGA GGACCAGACG 
TGCTTCGGCG GCTCCTCGCG 
AGCTGGGCAT GCTGATCCCG 
GATTGGACTC GGGCGCCATA 
ATGTACTTCC GCTGTATCAG 



GCTTCGGAGG 

CTCGCACGAG 

CCCGCTTCTA 

GAACGCTGCT 

GCGTCACCTT 

TCGAATTCGC 

GCAAATCGAT 

AGCAGCTGCG 

TCTACCTGTG 

CCCGGCTCGT 

GCCTGTCCAC 

CCGCGACCGC 

GTCATCCGGC 

AGGGTGGGGG 

GCAGACCCCG 

TTCTCTTCTT 

GAACCCGCGG 

TGGGTGTAGA 

ACGTGGTGTT 

TCGAACATGG 

ATCGTCGCCG 

CCCGTGTTCA 

CGCACCGCGG 

GACATCGAGC 

GAGGCGCTGG 

GCGAGGGGGG 

GCGGAGATGG 

CCCGTGGATC 

GTCTTTGGTA 

ACCCGCAGCG 

CTGTCTCTGC 

TGCGCCGTGC 

GATCGCGCTC 

CTGGCCGCGG 

CTGCCCAAAG 

CACGTCGTGA 

CCCCGAGACA 

CGCGCCGCCG 

CTGCTGCGGG 

AACGGCAACG 

GGAGCCGTCC 

AAAAGCGGCG 

GCAGACCCCA 



CCCCCTGGCG 

CAGCCTGACC 

CAAGGCGTTC 

GACGCTGATG 

AATATGCGAA 

CAATGACAAA 

TTCTTCCGGG 

CCACTCCCTG 

TCCTATTTTG 

CCCGCAAAAG 

GTATGCCGTG 

CAGACCGCAA 

CCCACCAGAG 

TGTGCTTCAG 

GACCAAAACC 

TCCCCCCCCC 

GCGCGCGGGG 

CCGCTGTCGC 

GGGCGCCGGC 

ACCCGTACTA 

ACTCCAGGAG 

ACATCCCCCG 

CCGCGGCGGC 

GCCGGATACG 

AGACCGCGGC 

AGGGCGCTGC 

AGGTTCAGAT 

TGCTACACAT 

CCTGGTACCG 

CCGACTTTCG 

AGTCGTGCGG 
TGTGTCTGTA 

CCGTTGCGTT 
TAATCGGCGA 
CGCAGTTCGC 
TCGCCACGTT 
CCAGCACCCG 
CCGCGTTTTT 
CGACCGCCAA 
TGTACGCGGA 
CGGCGGAGGC 
ACAACAACCT 
CGGTCGAGCT 



21180 

21240 

21300 

21360 

21420 

21480 

21540 

21600 

21660 

21720 

21780 

21840 

21900 

21960 

22020 

22080 

22140 

22200 

22260 

22320 

22380 

22440 

22500 

22560 

22620 

22680 

22740 

22800 

22860 

22920 

22980 

23040 

23100 

23160 

23220 

23280 

23340 

23400 

23460 

23520 

23580 

23640 

23700 




GACCCAGTTG TTTCCGGGGG CTGGCCGCCC TGTGCCTGGA CGCCCAGGCG GGGCGGCCAC 23760

TGGCGTTGAC GAGGCGCGTG GTGGATATGT CCGCCAGGCG GCGCTCGTGC 23820

GCCTCACCGC GCTGGAGCTC ATCAACCGCA CCCGCACAAA CACCACCCCT GTGGGGGAGA 23880

TTATTAACGC CCACGATGCC TTGGGGATAC AATACGAACA GGGCCTGGGG CTGCTCGCCC 23940

AGCAGGCACG CATCGGCTTG GCGTCGAACG CCAAGCGATT CGCCACGTTC AACGTGGGCA 24000

GCGACTACGA CCTGTTGTAC TTTTTGTGTC TCCCCAGTAC CTGTCCGTGG 24060

CCTAGGGAAG GGTGGGGGTG GTGGTGGTGG TGCTGTTGTT GTTTCTGGTC 24120

CGCCTGGTCA CAAAAGGCAC GGCGCCCCGA AACGCGGGCT TTAGTCCCGG CCCGGACGTC 24180

GGCGGACACA CAACAACGGC GGGCCCCGTG GGTGGGTAAG TTGGTTCGGG GGCATCGCTG 24240

TATTCCCTTG CCCGCTTCCA CCCCCCCTTC CPGTTTTnTT TGTTTGTGCG GGTGCCCATG 24300

GCGTCGGCGG AAATGCGCGA GCGGTTGGAG GCGCCTCTGC CCGACCGGGC GGTGCCCATC 24360

TACGTGGCCG GGTTTTTGGC CCTGTACGAC AGCGGGGACC CGGGCGAGCT GGCCCTGGAC 24420

CCAGACACGG TGCGTGCGGC CCTGCCTCCG GAGAACCCCC TGCCGATCAA CGTAGACCAC 24480

CGCGCTCGGT GCGAGGTGGG CCGGGTGCTC GCCGTGGTCA ACGACCCTCG GGGGCCGTTT 24540

TTTGTGGGGC TGATCGCGTG CGTGCAGCTG GAGCGCGTCC TCGAGACGGC CGCCAGCGCC 24600

GCTATTTTTG AGCGCCGCGG ACCCGCGCTC TCCCGGGAGG AGCGTCTGCT GTACCTGATC 24660

ACCAACTACC TGCCATCGGT CTCGCTGTCC ACAAAACGCC GGGGGGACGA GGTTCCGCCC 24720

GACCGCACCC TGTTTGCGCA CGTGGCCCTG TGCGCCATCG GGCGGCGCCT TGGAACCATC 24780

GTCACCTACG ACACCAGCCT AGACGCGGCC ATCGCTCCGT TTCGCCACCT GGACCCGGCG 24840

ACGCGCGAGG GGGTGCGACG CGAGGCCGCC GAGGCCGAGC TCGCGCTGGC CGGGCGCACC 24900

TGGGCCCCCG GCGTGGAGGC GCTCACACAC ACGCTGCTCT CCACCGCCGT CAACAACATG 24960

ATGCTGCGTG ACCGCTGGAG CCTCGTGGCC GAGCGGCGGC GGCAGGCCGG GATCGCCGGA 25020

CACACGTACC TTCAGGCGAG CGAAAAATTT AAAATATGGG GGGCGGAGTC TGCCCCTGCG 25080

CCGGAGCGCG GGTATAAAAC CGGCGCCCCG GGTGCCATGG ACACATCCCC CGCCGCGAGC 25140

GTTCCCGCGC CGCAGGTCGC CGTCCGTGCG CGTCAAGTCG CGTCGTCGTC GTCTTCTTCT 25200

TCTTCTTTTC CGGCACCGGC CGATATGAAC CCCGTTTCGG CATCGGGCGC CCCGGCCCCT 25260

CCGCCGCCCG GCGACGGGAG TTATTTGTGG ATCCCCGCCT TTCATTACAA TCAGCTCGTC 25320

ACCGGGCAAT CCGCGCCCCA CCACCCGCCG CTGACCGCGT GCGGCCTGCC GGCCGCGGGG 25380

ACGGTGGCCT ACGGACACCC CGGCGCCGGC CCGTCCCCGC ACTACCCGCC TCCTCCCGCC 25440

CACCCGTACC CGGGGTATGC TGTTCGCGGG CCCCAGTCCC CTGGAGGCCC AGATCGCCGC 25500

GCTGGTGGGG GCCATCGCCG CCGACCGCCA GGCGGGTGGG CTTCCGGCGG CCGCCGGAGA 25560

CCACGGGATC CGGGGGTCGG CGAACCGCCG CCGACACGAG GTGGAGCAGC CGGAGTACGA 25620

CTGCGGCCGT GACGAGCCGG ACCGGGACTT CCCGTATTAC CCGGGCGAGG CCCGCCCCGA 25680

GCCGCGCCCG GTCGACTCCC GGCGCGCCGC GCGCCAGGCT TCCGGGCCCC ACGAAACCAT 25740

CACGGCGCTG GTGGGGGCGG TGACGTCCCT GCAGCAGGAA CTGGCGCACA TGCGCGCGCG 25800

TACCCACGCC CCCTACGGGC CGTATCCGCC GGTGGGGCCC TACCACCACC CCCACGCAGA 25860

CACGGAGACC CCCGCCCAAC CACCCCGCTA CCCCGCCGAG GCCGTCTATC TGCCGCCGCC 25920

GGACATCGCC CCCCCGGGGC CTCCTCTATC CGGGGCGGTC CCCCCACCCT CGTATCCCCC 25980

AGTTGCGGTT ACCCCCGGTC CCGCCCCCCC GCTACATCAG CCCTCCCCCG CACACGCCCA 26040

CCCCCCTCCG CCGCCGCCGG GACCCACGCC TCCCCCCGCC GCGAGCTTAC CCCAACCCGA 26100

GGCGCCCGGC GCGGAGGCCG GCGCCTTAGT TAACGCCAGC AGCGCGGCCC ACGTGAACGT 26160

GGACACGGCC CGGGCCGCCG ATTTGTTTGT GTCACAGATG ATGGGGTCCC GCTAACTCGC 26220

CTCCAGGATC CGGACTTGGG GGGGGTGTGT GTTTTCATAT ATTTTAAATA AACAAACAAC 26280

CGGACAAAAG TATACGCACT TCGTGTGCTT GTGTTTTTGT TTGAGAGGGG GGGGGTGG 26339

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 897 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Val Ser Gly Arg Ala Gly Asp Pro Ala Gly Leu Pro Ala Pro Arg Gly
1 5 10 15

Gly Pro Thr Trp Pro Met Pro Ser Gly Gly Pro Pro Pro Glu Val Lys
20 25 30

Ala Gly Leu Arg Ala Asp Met Trp Gly Val Met Gly Gln Tyr Arg Glu
35 40 45

Ala Xaa Glu His Gln Thr Pro Asp Thr Glu Thr Val Val Ala Gly Met
50 55 60

His Pro Ala Leu Val Val Val Leu Lys Thr Met Phe Xaa Asp Ala Pro
65 70 75 80

Glu Thr Pro Val Leu Val Gln Phe Phe Ser Asp His Ala Pro Thr Ile
85 90 95

Ala Lys Ala Val Ser Asn Ala Ile Asn Ala Gly Ser Ala Ala Val Ala
100 105 110

Thr Asp Ala Ala Thr Val Asp Ala Ala Val Arg Ala His Gly Ala Asp
115 120 125

Ala Val Ser Ala Leu Gly Ala Ala Ala Arg Asp Pro Asp Leu Ser Phe
130 135 140

Leu Ala Ala Asp Ser Ala Ala Gly Tyr Val Lys Ala Thr Arg Leu Ala
145 150 155 160

Leu Glu Arg Ala Ile Asp Lys Leu Thr Thr Leu Gly Ser Ala Ala Ala
165 170 175

Asp Leu Val Phe His Ala Arg Arg Ala Cys Ala Gln Pro Glu Gly Asp
180 185 190

His Ala Ala Leu Ile Asp Ala Ala Ala Arg Ala Thr Thr Ala Ala Arg
195 200 205

Glu Ser Leu Ala Gly His Glu Ala Gly Phe Gly Gly Leu Leu His Ala
210 215 220

Glu Gly Thr Ala Gly Asp His Ser Pro Ser Gly Arg Ala Leu Gln Glu
225 230 235 240

Leu Gly Lys Val Ile Gly Ala Thr Arg Arg Arg Ala Glu Glu Leu Glu
245 250 255

Ala Ala Val Ala Asp Leu Thr Gly Lys Met Ala Ala Gln Arg Arg Ser
260 265 270

Ser Trp Ala Ala Gly Val Glu Ala Ala Leu Asp Arg Val Glu Asn Arg
275 280 285

Ala Glu Phe Asp Val Val Glu Leu Arg Arg Leu Gln Ala Gly Thr His
290 295 300

Gly Tyr Asn Pro Arg Asp Phe Arg Lys Arg Ala Glu Gln Ala Ala Asn
305 310 315 320

Ala Glu Ala Val Thr Leu Ala Leu Asp Thr Ala Phe Ala Phe Asn Pro
325 330 335

Tyr Thr Pro Glu Asn Gln Arg His Pro Met Leu Pro Pro Leu Ala Ala
340 345 350

Ile His Arg Leu Gly Trp Ser Ala Ala Phe His Ala Ala Ala Glu Thr
355 360 365

Tyr Ala Asp Met Phe Arg Val Asp Ala Glu Pro Leu Ala Arg Leu Leu
370 375 380

Arg Ile Ala Glu Gly Leu Leu Glu Met Ala Gln Ala Gly Asp Gly Phe
385 390 395 400

Ile Asp Tyr His Glu Ala Val Gly Arg Leu Ala Asp Asp Met Thr Ser
405 410 415

Val Pro Gly Leu Arg Arg Tyr Val Pro Phe Phe Gln His Gly Tyr Ala
420 425 430

Asp Tyr Val Glu Leu Arg Asp Arg Leu Asp Ala Ile Arg Ala Asp Val
435 440 445

His Arg Ala Leu Gly Gly Val Pro Leu Asp Leu Ala Ala Ala Ala Glu
450 455 460

Gln Ile Ser Ala Ala Arg Asn Asp Pro Glu Ala Thr Ala Glu Leu Val
465 470 475 480

Arg Thr Gly Val Thr Leu Pro Cys Pro Ser Glu Asp Ala Leu Val Ala
485 490 495

Cys Ala Ala Ala Leu Glu Arg Val Asp Gln Ser Pro Val Lys Asn Thr
500 505 510

Ala Tyr Ala Glu Tyr Val Ala Phe Val Thr Arg Gln Asp Thr Ala Glu
515 520 525

Thr Lys Asp Ala Val Val Arg Ala Lys Gln Gln Arg Ala Glu Ala Thr
530 535 540

Glu Arg Val Met Ala Gly Leu Arg Glu Ala Ala Arg Glu Arg Arg Ala
545 550 555 560

Gln Ile Glu Ala Glu Gly Leu Ala Asn Leu Lys Thr Met Leu Lys Val
565 570 575

Val Ala Val Pro Ala Thr Val Ala Lys Thr Leu Asp Gln Ala Arg Ser
580 585 590

Val Ala Glu Ile Ala Asp Gln Val Glu Val Leu Leu Asp Gln Thr Glu
595 600 605

Lys Thr Arg Glu Leu Asp Val Pro Ala Val Ile Trp Leu Glu His Ala
610 615 620

Gln Arg Thr Phe Glu Thr His Pro Leu Ser Ala Arg Asp Gly Pro Gly
625 630 635 640

Pro Leu Ala Arg His Ala Gly Arg Leu Gly Ala Leu Phe Asp Thr Arg
645 650 655

Arg Arg Val Asp Ala Leu Arg Arg Ser Leu Glu Glu Ala Glu Ala Glu
660 665 670

Trp Asp Glu Val Trp Gly Arg Phe Gly Arg Val Arg Gly Gly Ala Trp
675 680 685

Lys Ser Pro Glu Gly Phe Arg Ala Met His Glu Gln Leu Arg Ala Leu
690 695 700

Gln Asp Thr Thr Asn Thr Val Ser Gly Leu Arg Ala Gln Pro Ala Tyr
705 710 715 720

Glu Arg Leu Ser Ala Arg Tyr Gln Gly Val Leu Gly Ala Lys Gly Ala
725 730 735

Glu Arg Ala Glu Ala Val Glu Glu Leu Gly Ala Arg Val Thr Lys His
740 745 750

Thr Ala Leu Cys Ala Arg Leu Arg Asp Glu Val Val Arg Arg Val Pro
755 760 765

Trp Glu Met Asn Phe Asp Ala Leu Gly Arg Leu Leu Ala Glu Phe Asp
770 775 780

Ala Ala Ala Ala Asp Leu Ala Pro Trp Ala Val Glu Glu Phe Arg Gly
785 790 795 800

Ala Arg Glu Leu Ile Gln Tyr Arg Met Gly Ser Ala Tyr Ala Arg Ala
805 810 815

Gly Gly Lys Ala Leu Phe Leu Phe Phe Phe Phe Pro Pro Pro Leu Ser
820 825 830

Ser Phe Leu Pro His Phe His Phe Phe Ile His His His His Ser Phe
835 840 845

Thr Lys Phe Phe Thr Ser Ser Ser Leu His Ser Tyr His Leu Phe Pro
850 855 860

Ser Ser Ile Tyr Ser Ile Pro Ser Ile Ser Pro Leu Tyr Pro His Ser
865 870 875 880

Ser Leu Ser Phe Pro Ser Ser Gln Phe Leu His Ile Phe Leu Ser Leu
885 890 895

Pro

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Val Met Pro Val Ala Pro Pro Pro Arg Gly Ala Gly Gly Arg Ala Pro
1 5 10 15

Cys Pro Pro Ala Leu Gly Pro Glu Ala Ile His Ala Arg Leu Glu Asp
20 25 30

Val Arg Ile Gln Ala Arg Arg Ala Ile Glu Ser Ala Ile Lys Glu Tyr
35 40 45

Phe His Arg Gly Ala Val Tyr Ser Ala Lys Ala Leu Gln Ala Ser Asp
50 55 60

Ser His Asp Cys Arg Phe His Val Ala Ser Ala Ala Val Val Pro Met
65 70 75 80

Val Gln Leu Leu Glu Ser Leu Pro Ala Phe Asp Gln His Thr Arg Asp
85 90 95

Val Ala Gln Arg Ala Ala Leu Pro Pro Pro Pro Pro Leu Ala Thr Ser
100 105 110

Pro Gln Ala Ile Leu Leu Arg Asp Leu Leu Gln Arg Gly Gln Thr Leu
115 120 125

Asp Ala Pro Glu Asp Leu Ala Ala Trp Leu Ser Val Leu Thr Asp Ala
130 135 140

Ala Thr Gln Gly Leu Ile Glu Arg Lys Pro Leu Glu Glu Leu Ala Arg
145 150 155 160

Ser Ile His Gly Ile Asn Asp Gln Gln Ala Arg Arg Ser Ser Gly Leu
165 170 175

Ala Glu Leu Gln Arg Phe Asp Ala Leu Asp Ala Ala Gln Gln Leu Asp
180 185 190

Ser Asp Ala Ala Phe Val Pro Ala Thr Gly Pro Ala Pro Tyr Val Asp
195 200 205

Gly Gly Gly Leu Ser Pro Glu Ala Thr Arg Met Ala Glu Asp Ala Leu
210 215 220

Arg Gln Ala Arg Ala Met Glu Ala Ala Lys Met Thr Ala Glu Leu Ala
225 230 235 240

Pro Glu Ala Arg Ser Arg Leu Arg Glu Arg Ala His Ala Leu Glu Ala
245 250 255

Met Leu Asn Asp Ala Arg Glu Arg Ala Lys Val Ala His Asp Ala Arg
260 265 270

Glu Lys Phe Leu His Lys Leu Gln Gly Val Leu Arg Pro Leu Pro Asp
275 280 285

Phe Val Gly Leu Lys Ala Cys Pro Ala Val Leu Ala Thr Leu Arg Ala
290 295 300

Ser Leu Pro Arg Gly Val Asp Arg Pro Gly Arg Cys Arg Pro Gly Ala
305 310 315 320

Pro Pro Arg Lys Ser Arg Arg Gly Cys Gly Arg Thr Cys Gly Gly
325 330 335

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 800 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Val Val Thr Gly Val Arg Asn Gln Phe Ala Thr Asp Leu Glu Pro Gly
1 5 10 15

Gly Ser Val Ser Cys Met Arg Ser Ser Leu Ser Phe Leu Ser Leu Leu
20 25 30

Phe Asp Val Gly Pro Arg Asp Val Leu Ser Ala Glu Ala Ile Glu Gly
35 40 45

Cys Leu Val Glu Gly Gly Glu Trp Thr Arg Ala Ala Ala Gly Ser Gly
50 55 60

Pro Pro Arg Met Cys Ser Ile Ile Glu Leu Pro Asn Phe Leu Glu Tyr
65 70 75 80

Pro Ala Arg Gly Leu Arg Cys Val Phe Ser Arg Val Tyr Gly Glu Val
85 90 95

Gly Phe Phe Gly Glu Pro Thr Ala Gly Leu Leu Glu Thr Gln Cys Pro
100 105 110

Ala His Thr Phe Phe Ala Gly Pro Trp Ala Met Arg Pro Leu Ser Tyr
115 120 125

Thr Leu Leu Thr Ile Gly Pro Leu Gly Met Gly Arg Asp Gly Asp Thr
130 135 140

Ala Tyr Leu Phe Asp Pro His Gly Leu Pro Ala Gly Thr Pro Ala Phe
145 150 155 160

Ile Ala Lys Val Arg Ala Gly Asp Val Tyr Pro Tyr Leu Thr Tyr Tyr
165 170 175

Ala His Asp Arg Pro Lys Val Arg Trp Ala Gly Ala Met Val Phe Phe
180 185 190

Val Pro Ser Gly Pro Gly Ala Val Ala Pro Ala Asp Leu Thr Ala Ala
195 200 205

Ala Leu His Leu Tyr Gly Ala Ser Glu Thr Tyr Leu Gln Asp Glu Pro
210 215 220

Phe Val Glu Arg Arg Val Ala Ile Thr His Pro Leu Arg Gly Glu Ile
225 230 235 240

Gly Gly Leu Gly Ala Leu Phe Val Gly Val Val Pro Arg Gly Asp Gly
245 250 255

Glu Gly Ser Gly Pro Val Val Pro Ala Leu Pro Ala Pro Thr His Val
260 265 270

Gln Thr Pro Arg Ala Asp Arg Pro Pro Glu Ala Pro Arg Gly Ala Ser
275 280 285

Gly Pro Pro Asn Thr Pro Gln Ala Gly His Pro Asn Arg Pro Pro Asp
290 295 300

Asp Val Trp Ala Ala Ala Leu Glu Gly Thr Pro Pro Ala Lys Pro Ser
305 310 315 320

Ala Pro Asp Ala Ala Ala Ser Gly Pro Pro His Ala Ala Pro Pro Pro
325 330 335

Gln Thr Pro Ala Gly Asp Ala Ala Glu Glu Ala Glu Asp Leu Arg Val
340 345 350

Leu Glu Val Gly Ala Val Pro Val Gly Cys His Arg Ala Arg Tyr Ser
355 360 365

Thr Gly Leu Pro Lys Arg Arg Arg Pro Thr Trp Thr Pro Pro Ser Ser
370 375 380

Val Glu Asp Leu Thr Ser Gly Glu Arg Pro Ala Pro Lys Ala Pro Pro
385 390 395 400

Ala Lys Ala Lys Lys Lys Ser Ala Pro Lys Lys Lys Ala Pro Val Ala
405 410 415

Ala Glu Val Pro Ala Ser Ser Pro Thr Pro Ile Ala Ala Thr Val Pro
420 425 430

Pro Ala Pro Asp Thr Pro Pro Gln Ser Gly Gln Gly Gly Gly Asp Asp
435 440 445

Gly Pro Asp Ser Ser Pro Ser Val Leu Glu Thr Leu Gly Ala Arg Arg
450 455 460

Pro Pro Glu Pro Pro Gly Ala Asp Leu Ala Gln Leu Phe Glu Val His
465 470 475 480

Pro Asn Val Ala Ala Thr Ala Val Arg Leu Ala Ala Arg Asp Ala Ala
485 490 495

Arg Glu Val Ala Ala Cys Ser Gln Leu Thr Ile Asn Ala Leu Arg Ser
500 505 510

Pro Tyr Pro Ala His Pro Gly Leu Leu Glu Leu Cys Val Ile Phe Phe
515 520 525

Phe Glu Arg Val Leu Ala Phe Leu Ile Glu Asn Gly Ala Arg Thr His
530 535 540

Thr Gln Ala Gly Val Ala Gly Pro Ala Ala Ala Leu Leu Asp Phe Thr
545 550 555 560

Leu Arg Met Pro Pro Arg Lys Thr Ala Val Gly Asp Phe Leu Ala Ser
565 570 575

Thr Arg Met Ser Leu Ala Asp Val Ala Ala His Arg Pro Leu Ile Gln
580 585 590

His Val Leu Asp Lys Asn Ser Gln Ile Gly Arg Leu Ala Lys Leu Val
595 600 605

Leu Val Ala Arg Asp Phe Ile Arg Glu Thr Asp Ala Phe Tyr Gly Asp
610 615 620

Leu Ala Asp Leu Asp Leu Gln Leu Arg Ala Ala Pro Pro Ala Asn Leu
625 630 635 640

Tyr Ala Arg Leu Gly Lys Trp Leu Leu Glu Arg Ser Arg Ala His Pro
645 650 655

Asn Thr Leu Phe Ala Pro Ala Thr Pro Thr His Pro Glu Pro Leu Leu
660 665 670

His Arg Ile Gln Ala His Phe Arg Lys Lys Met Arg Val Glu Ala Glu
675 680 685

Ala Arg Glu Met Arg Glu Ala Leu Tyr Arg Val Tyr Ser Val Ser Gln
690 695 700

Arg Ala Gly Pro Pro Asp Arg Asp Ala Arg Cys Pro Pro Pro Pro Gly
705 710 715 720

Arg Arg Arg Gln Gly Pro Val Pro Ala Arg Pro Gly Pro Arg Gly His
725 730 735

Pro Cys Ala Ala Gly Gly Arg Ala Asp Pro Gly Pro Pro Gly Asp Arg
740 745 750

Lys Arg Asp Gln Gly Val Leu Pro Pro Gly Ser Arg Ile Gln Arg Glu
755 760 765

Gly Pro Ala Gly Gln Arg Gln Pro Arg Leu Ser Val Ser Arg Gly Leu
770 775 780

Gly Arg Gly Arg Ala His Gly Pro Val Ala Gly Ile Ala Thr Gly Leu
785 790 795 800

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 158 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Met Asn Ala His Phe Ala Asn Glu Val Gln Tyr Asp Leu Thr Arg Asp
1 5 10 15

Pro Ser Ser Pro Ala Ser Leu Ile His Val Ile Ile Ser Ser Glu Cys
20 25 30

Leu Ala Ala Ala Gly Val Pro Leu Ser Ala Leu Val Arg Gly Arg Pro
35 40 45

Asp Gly Gly Ala Ala Ala Asn Phe Arg Val Glu Thr Gln Trp His Ala
50 55 60

Pro Gly Asp Cys Thr Pro Trp Arg Ser Ala Phe Ala Ala Tyr Val Pro
65 70 75 80

Ala Asp Ala Val Gly Ala Ile Leu Ala Pro Val Ile Pro Ala His Pro
85 90 95

Asp Leu Leu Pro Arg Val Pro Ser Ala Gly Gly Leu Phe Val Ser Leu
100 105 110

Pro Val Ala Cys Asp Ala Gln Gly Val Tyr Asp Pro Tyr Thr Val Ala
115 120 125

Ala Leu Arg Leu Ala Trp Gly Pro Trp Ala Thr Cys Ala Arg Val Leu
130 135 140

Leu Phe Ser Tyr Asp Glu Leu Thr Arg Tyr Arg Val Cys Gly
145 150 155

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 423 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Val Pro Glu Gly Ala Trp Val Gly Gly Ala Cys Ala Arg Pro Arg Gly
1 5 10 15

Pro Arg Ala His Val Arg Leu Tyr Ala Val Cys Phe Val Cys Pro Gln
20 25 30

Gly Ile Arg Gly Gln Asp Phe Asn Leu Leu Phe Val Asp Glu Ala Asn
35 40 45

Phe Ile Arg Pro Asp Ala Val Gln Thr Ile Met Gly Phe Leu Asn Gln
50 55 60

Ala Asn Cys Lys Ile Ile Phe Val Ser Ser Thr Asn Thr Gly Lys Ala
65 70 75 80

Ser Thr Ser Phe Leu Tyr Asn Leu Arg Gly Ala Ala Asp Glu Leu Leu
85 90 95

Asn Val Val Thr Tyr Ile Cys Asp Asp His Met Pro Arg Val Val Thr
100 105 110

His Thr Asn Ala Thr Ala Cys Ser Cys Tyr Ile Leu Asn Lys Pro Val
115 120 125

Phe Ile Thr Met Asp Gly Ala Val Arg Arg Thr Ala Asp Leu Phe Leu
130 135 140

Pro Asp Ser Phe Met Gln Glu Ile Ile Gly Gly Gln Ala Arg Glu Thr
145 150 155 160

Gly Asp Asp Arg Pro Val Leu Thr Lys Ser Ala Gly Glu Arg Phe Leu
165 170 175

Leu Tyr Arg Pro Ser Thr Thr Thr Asn Ser Gly Leu Met Ala Pro Glu
180 185 190

Leu Tyr Val Tyr Val Asp Pro Ala Phe Thr Ala Asn Thr Arg Ala Ser
195 200 205

Gly Thr Gly Ile Ala Val Val Gly Arg Tyr Arg Asp Asp Phe Ile Ile
210 215 220

Phe Ala Leu Glu His Phe Phe Leu Arg Ala Leu Thr Gly Ser Ala Pro
225 230 235 240

Ala Asp Ile Ala Arg Cys Val Val His Ser Leu Ala Gln Val Leu Ala
245 250 255

Leu His Pro Gly Ala Phe Arg Ser Val Arg Val Ala Val Glu Gly Asn
260 265 270

Ser Ser Gln Asp Ser Ala Val Ala Ile Ala Thr His Val His Thr Glu
275 280 285

Met His Arg Ile Leu Ala Ser Ala Gly Ala Asn Gly Pro Gly Pro Glu
290 295 300

Leu Leu Phe Tyr His Cys Glu Pro Pro Gly Gly Ala Val Leu Tyr Pro
305 310 315 320

Phe Phe Leu Leu Asn Lys Gln Lys Thr Pro Ala Phe Glu Tyr Phe Ile
325 330 335

Lys Lys Phe Asn Ser Gly Gly Val Met Ala Ser Gln Glu Leu Val Ser
340 345 350

Val Thr Val Arg Leu Gln Thr Asp Pro Val Glu Tyr Leu Ser Glu Gln
355 360 365

Leu Asn Asn Leu Ile Glu Thr Val Ser Pro Asn Thr Asp Val Arg Met
370 375 380

Tyr Ser Gly Lys Arg Asn Gly Ala Ala Asp Asp Leu Met Val Ala Val
385 390 395 400

Ile Met Ala Ile Tyr Leu Ala Ala Pro Thr Gly Ile Pro Pro Ala Phe
405 410 415

Phe Pro Ile Thr Arg Thr Ser
420

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 355 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Val Leu Leu Ser Pro Ala Pro Pro Pro Leu Pro His Gly Arg Cys Pro
1 5 10 15

Pro Ser Leu Phe His His Arg Pro Gly Cys Val Ser Gly Pro Pro Ala
20 25 30

Pro Pro Arg Ser Gly Val Ser Arg Pro Gly Ala Met Ile Thr Asp Cys
35 40 45

Phe Glu Ala Asp Ile Ala Ile Pro Ser Gly Ile Ser Arg Pro Asp Ala
50 55 60

Ala Ala Leu Gln Arg Cys Glu Gly Arg Val Val Phe Leu Pro Thr Ile
65 70 75 80

Arg Arg Gln Leu Ala Asp Val Ala His Glu Ser Phe Val Ser Gly Gly
85 90 95

Val Ser Pro Asp Thr Leu Gly Leu Leu Leu Ala Tyr Arg Arg Arg Phe
100 105 110

Pro Ala Val Ile Thr Arg Val Leu Pro Thr Arg Ile Val Ala Cys Pro
115 120 125

Val Asp Leu Gly Leu Thr His Ala Gly Thr Val Asn Leu Arg Asn Thr
130 135 140

Ser Pro Val Asp Leu Cys Asn Gly Asp Pro Val Ser Leu Val Pro Pro
145 150 155 160

Val Phe Glu Gly Gln Ala Thr Asp Val Arg Leu Glu Ser Leu Asp Leu
165 170 175

Thr Leu Arg Phe Pro Val Pro Leu Pro Thr Pro Leu Ala Arg Glu Ile
180 185 190

Val Ala Arg Leu Val Arg Ile Arg Asp Leu Asn Pro Asp Pro Arg Thr
195 200 205

Pro Gly Glu Leu Pro Asp Leu Asn Val Leu Tyr Tyr Asn Gly Ala Arg
210 215 220

Leu Ser Leu Val Ala Asp Val Gln Gln Leu Ala Ser Val Asn Thr Glu
225 230 235 240

Leu Arg Ser Leu Val Leu Asn Met Val Tyr Ser Ile Thr Glu Gly Thr
245 250 255

Thr Leu Ile Leu Thr Leu Ile Pro Arg Leu Leu Ala Leu Ser Ala Gln
260 265 270

Asp Gly Tyr Val Asn Ala Leu Leu Gln Met Gln Ser Val Thr Arg Glu
275 280 285

Ala Ala Gln Leu Ile His Pro Glu Ala Pro Met Leu Met Gln Asp Gly
290 295 300

Glu Arg Arg Leu Pro Leu Tyr Glu Ala Leu Val Ala Trp Leu Ala His
305 310 315 320

Ala Gly Gln Leu Gly Asp Ile Leu Ala Pro Ala Val Arg Val Cys Thr
325 330 335

Phe Asp Gly Ala Ala Val Val Gln Ser Gly Asp Met Ala Pro Val Ile
340 345 350

Arg Tyr Pro
355

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1382 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Val Trp Glu Gly Leu Gly Leu Pro Glu Leu Gly Leu Met Glu Pro Ala
1 5 10 15

Asn Pro Pro Arg Asn Pro Met Ala Ala Pro Ala Arg Asp Pro Pro Gly
20 25 30

Tyr Arg Tyr Ala Ala Ala Met Val Pro Thr Gly Ser Ile Leu Ser Thr
35 40 45

Ile Glu Val Ala Ser His Arg Arg Leu Phe Asp Phe Phe Ala Arg Val
50 55 60

Arg Ser Asp Glu Asn Ser Leu Tyr Asp Val Glu Phe Asp Ala Leu Leu
65 70 75 80

Gly Ser Tyr Cys Asn Thr Leu Ser Leu Val Arg Phe Leu Glu Leu Gly
85 90 95

Leu Ser Val Ala Cys Val Cys Thr Lys Phe Pro Glu Leu Ala Tyr Met
100 105 110

Asn Glu Gly Arg Val Gln Phe Glu Val His Gln Pro Leu Ile Ala Arg
115 120 125

Asp Gly Pro His Pro Val Glu Gln Pro Val His Asn Tyr Met Thr Lys
130 135 140

Val Ile Asp Arg Arg Ala Leu Asn Ala Ala Phe Ser Leu Ala Thr Glu
145 150 155 160

Ala Ile Ala Leu Leu Thr Gly Glu Ala Leu Asp Gly Thr Gly Ile Ser
165 170 175

Leu His Arg Gln Leu Arg Ala Ile Gln Gln Leu Ala Arg Asn Val Gln
180 185 190

Ala Val Leu Gly Ala Phe Glu Arg Gly Thr Ala Asp Gln Met Leu His
195 200 205

Val Leu Leu Glu Lys Ala Pro Pro Leu Ala Leu Leu Leu Pro Met Gln
210 215 220

Arg Tyr Leu Asp Asn Gly Arg Leu Ala Thr Arg Val Ala Arg Ala Thr
225 230 235 240

Leu Val Ala Glu Leu Lys Arg Ser Phe Cys Asp Thr Ser Phe Phe Leu
245 250 255

Gly Lys Ala Gly His Arg Arg Glu Ala Ile Glu Ala Trp Leu Val Asp
260 265 270

Leu Thr Thr Ala Thr Gln Pro Ser Val Ala Val Pro Arg Leu Thr His
275 280 285

Ala Asp Thr Arg Gly Arg Pro Val Asp Gly Val Leu Val Thr Thr Ala
290 295 300

Ala Ile Lys Gln Arg Leu Leu Gln Ser Phe Leu Lys Val Glu Asp Thr
305 310 315 320

Glu Ala Asp Val Pro Val Thr Tyr Gly Glu Met Val Leu Asn Gly Ala
325 330 335

Asn Leu Val Thr Ala Leu Val Met Gly Lys Ala Val Arg Ser Leu Asp
340 345 350

Asp Val Gly Arg His Leu Leu Glu Met Gln Glu Glu Gln Leu Glu Ala
355 360 365

Asn Arg Glu Thr Leu Asp Glu Leu Glu Ser Ala Pro Gln Thr Thr Arg
370 375 380

Val Arg Ala Asp Leu Val Ala Ile Gly Asp Arg Leu Val Phe Leu Glu
385 390 395 400

Ala Leu Glu Lys Arg Ile Tyr Ala Ala Thr Asn Val Pro Tyr Pro Leu
405 410 415

Val Gly Ala Met Asp Leu Thr Phe Val Leu Pro Leu Gly Leu Phe Asn
420 425 430

Pro Ala Met Glu Arg Phe Ala Ala His Ala Gly Asp Leu Val Pro Ala
435 440 445

Pro Gly His Pro Glu Pro Arg Ala Phe Pro Pro Arg Gln Leu Phe Phe
450 455 460

Trp Gly Lys Asp His Gln Val Leu Arg Leu Ser Met Glu Asn Ala Val
465 470 475 480

Gly Thr Val Cys His Pro Ser Leu Met Asn Ile Asp Ala Ala Val Gly
485 490 495

Gly Val Asn His Asp Pro Val Glu Ala Ala Asn Pro Tyr Gly Ala Tyr
500 505 510

Val Ala Ala Pro Ala Gly Pro Gly Ala Asp Met Gln Gln Arg Phe Leu
515 520 525

Asn Ala Trp Arg Gln Arg Leu Ala His Gly Arg Val Arg Trp Val Ala
530 535 540

Glu Cys Gln Met Thr Ala Glu Gln Phe Met Gln Pro Asp Asn Ala Asn
545 550 555 560

Leu Ala Leu Glu Leu His Pro Ala Phe Asp Phe Phe Ala Gly Val Ala
565 570 575

Asp Val Glu Leu Pro Gly Gly Glu Val Pro Pro Ala Gly Pro Gly Ala
580 585 590

Ile Gln Ala Thr Trp Arg Val Val Asn Gly Asn Leu Pro Leu Ala Leu
595 600 605

Cys Pro Val Ala Phe Arg Asp Arg Leu Glu Leu Gly Val Gly Arg His
610 615 620

Ala Met Ala Pro Ala Thr Ile Ala Ala Val Arg Gly Ala Phe Glu Asp
625 630 635 640

Arg Ser Tyr Pro Ala Val Phe Tyr Leu Leu Gln Ala Ala Ile His Gly
645 650 655

Ser Glu His Val Phe Cys Ala Arg Leu Val Thr Gln Cys Ile Thr Ser
660 665 670

Tyr Trp Asn Asn Thr Arg Cys Ala Ala Phe Val Asn Asp Tyr Ser Leu
675 680 685

Val Ser Tyr Ile Val Thr Tyr Leu Gly Gly Asp Leu Pro Glu Glu Cys
690 695 700

Met Ala Val Tyr Arg Asp Leu Val Ala His Val Glu Ala Gln Leu Val
705 710 715 720

Asp Asp Phe Thr Leu Pro Gly Pro Glu Leu Gly Gly Gln Ala Gln Ala
725 730 735

Glu Leu Asn His Leu Met Arg Asp Pro Ala Leu Leu Pro Pro Leu Val
740 745 750

Trp Asp Cys Asp Gly Leu Met Arg His Ala Ala Leu Asp Arg His Arg
755 760 765

Asp Cys Arg Ile Asp Ala Gly Gly His Glu Pro Val Tyr Ala Ala Ala
770 775 780

Cys Asn Val Ala Thr Ala Asp Phe Asn Arg Asn Asp Gly Arg Leu Leu
785 790 795 800

His Asn Thr Gln Ala Arg Ala Ala Asp Ala Ala Asp Asp Arg Pro His
805 810 815

Arg Pro Ala Asp Trp Thr Val His His Lys Ile Tyr Tyr Tyr Val Leu
820 825 830

Val Pro Ala Phe Ser Arg Gly Arg Cys Cys Thr Ala Gly Val Arg Phe
835 840 845

Asp Arg Val Tyr Ala Thr Leu Gln Asn Met Val Val Pro Glu Ile Ala
850 855 860

Pro Gly Glu Glu Cys Pro Ser Asp Pro Val Thr Asp Pro Ala His Pro
865 870 875 880

Leu His Pro Ala Asn Leu Val Ala Asn Thr Val Asn Ala Met Phe His
885 890 895

Asn Gly Arg Val Val Val Asp Gly Pro Ala Met Leu Thr Leu Gln Val
900 905 910

Leu Ala His Asn Met Ala Glu Arg Thr Thr Ala Leu Leu Cys Ser Ala
915 920 925

Ala Pro Asp Ala Gly Ala Asn Thr Ala Ser Thr Ala Asn Met Arg Ile
930 935 940

Phe Asp Gly Ala Leu His Ala Gly Val Leu Leu Met Ala Pro Gln His
945 950 955 960

Leu Asp His Thr Ile Gln Asn Gly Glu Tyr Phe Tyr Val Leu Pro Val
965 970 975

His Ala Leu Phe Ala Gly Ala Asp His Val Ala Asn Ala Pro Asn Phe
980 985 990

Pro Pro Ala Leu Arg Asp Leu Ala Arg His Val Pro Leu Val Pro Pro
995 1000 1005

Ala Leu Gly Ala Asn Tyr Phe Ser Ser Ile Arg Gln Pro Val Val Gln
1010 1015 1020

His Ala Arg Glu Ser Ala Ala Gly Glu Asn Ala Leu Thr Tyr Ala Leu
1025 1030 1035 1040

Met Ala Gly Tyr Phe Lys Met Ser Pro Val Tyr His Gln Leu Lys Thr
1045 1050 1055

Gly Leu His Pro Gly Phe Gly Phe Thr Val Val Arg Gln Asp Arg Phe
1060 1065 1070

Val Thr Glu Asn Val Leu Phe Ser Ala Ser Glu Ala Tyr Phe Leu Gly
1075 1080 1085

Gln Leu Gln Val Ala Arg His Glu Thr Gly Gly Gly Val Ser Phe Thr
1090 1095 1100

Leu Thr Gln Pro Arg Gly Asn Val Asp Leu Gly Val Gly Tyr Thr Ala
1105 1110 1115 1120

Val Ala Ala Thr Ala Thr Val Arg Asn Pro Val Thr Asp Met Gly Asn
1125 1130 1135

Leu Pro Gln Asn Phe Tyr Leu Gly Arg Gly Ala Pro Pro Leu Leu Asn
1140 1145 1150

Asn Ala Ala Ala Val Tyr Leu Arg Asn Ala Val Val Ala Gly Asn Arg
1155 1160 1165

Leu Gly Pro Ala Gln Pro Leu Pro Val Phe Gly Cys Ala Gln Val Pro
1170 1175 1180

Arg Arg Ala Gly Met Asp His Gly Gln Asp Ala Val Cys Glu Phe Ile
1185 1190 1195 1200

Ala Thr Pro Val Ala Thr Asp Ile Asn Tyr Phe Arg Arg Pro Cys Asn
1205 1210 1215

Pro Arg Gly Arg Ala Ala Gly Gly Val Tyr Ala Gly Asp Lys Glu Gly
1220 1225 1230

Asp Val Ile Ala Leu Met Tyr Asp His Gly Gln Ser Asp Pro Ala Arg
1235 1240 1245

Pro Phe Ala Ala Thr Ala Asn Pro Trp Ala Ser Gln Arg Phe Ser Tyr
1250 1255 1260

Gly Asp Leu Leu Tyr Asn Gly Ala Tyr His Leu Asn Gly Asp Val Leu
1265 1270 1275 1280

Ser Pro Cys Phe Lys Phe Phe Thr Ala Ala Asp Ile Thr Ala Lys His
1285 1290 1295

Arg Cys Leu Glu Arg Leu Ile Val Glu Thr Gly Ser Ala Val Ser Thr
1300 1305 1310

Ala Thr Ala Ala Ser Asp Val Gln Phe Lys Arg Pro Pro Gly Cys Arg
1315 1320 1325

Glu Leu Val Glu Asp Pro Cys Gly Leu Phe Gln Glu Ala Tyr Pro Ile
1330 1335 1340

Thr Cys Ala Ser Asp Pro Ala Leu Leu Arg Ser Ala Arg Asp Gly Glu
1345 1350 1355 1360

Ala His Ala Arg Glu Thr His Phe Thr Gln Tyr Leu Ile Tyr Asp Asp
1365 1370 1375

Leu Lys Gly Leu Ser Leu
1380



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 222 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



Met Thr Met Arg Asp Asp Val Pro Leu Leu Asp Arg Glu Leu Val Tyr
1 5 10 15

Glu Ala Ala Cys Gly Gly Glu Asp Gly Glu Leu Pro Leu Asp Glu Gln
20 25 30

Phe Ser Leu Ser Ser Tyr Gly Thr Ser Asp Phe Phe Val Ser Ser Ala
35 40 45

Tyr Ser Arg Leu Pro Pro His Thr Gln Pro Val Phe Ser Lys Arg Val
50 55 60

Val Met Phe Ala Trp Ser Phe Leu Val Leu Lys Pro Leu Glu Leu Val
65 70 75 80

Ala Ala Gly Met Tyr Tyr Gly Trp Thr Gly Arg Ala Val Ala Pro Ala
85 90 95

Cys Ile Ile Ala Ala Val Leu Ala Tyr Tyr Val Thr Trp Leu Ala Arg
100 105 110

Ala Leu Leu Leu Tyr Val Asn Ile Lys Arg Asp Arg Leu Pro Leu Ser
115 120 125

Pro Pro Val Phe Trp Gly Leu Cys Val Ile Met Gly Gly Ala Ala Leu
130 135 140

Cys Ala Leu Val Ala Ala Ala His Glu Thr Phe Ser Pro Asp Gly Leu
145 150 155 160

Phe His Trp Ile Thr Ala Ser Gln Leu Leu Pro Arg Thr Asp Pro Leu
165 170 175

Arg Ala Arg Ser Leu Gly Ile Ala Cys Ala Ala Gly Ala Ala Met Trp
180 185 190

Val Ala Ala Ala Asp Cys Phe Ala Ala Phe Thr Asn Phe Phe Leu Ala
195 200 205

Arg Phe Trp Thr Arg Ala Ile Leu Lys Ala Pro Val Ala Phe
210 215 220

(2) INFORMATION FOR SEQ ID NO: 46:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 627 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Val Gly Arg Gln Gly Glu Arg Trp Val Gly Gly Gly Asn Glu Lys Asn
1 5 10 15

Thr Gln Arg Ala Thr Ser Gly Met Arg Pro Glu Leu Ser Leu Lys Gly
20 25 30

Arg Cys Val Thr Glu Ala Val Val Cys Pro Ser Thr Asp Ala Ala
35 40 45

Ile His Ser Gly Gly Ser Ser Ser Val Arg Pro Gln Pro Tyr Ala Arg
50 55 60

Ala Ala Arg Ala Arg Ala Thr His Gly Ser Arg Ser Arg His Arg Gln
65 70 75 80

Pro Leu Leu Pro Pro Pro Ser Ser His His Pro Thr Ile Pro Pro Pro
85 90 95

Pro Ser Pro Pro Arg Gly Ser Pro Ala Met Glu Leu Thr Tyr Ala Thr
100 105 110

Thr Leu His His Arg Asp Val Val Phe Tyr Val Thr Ala Asp Arg Asn
115 120 125

Arg Ala Tyr Phe Val Cys Gly Gly Ser Val Tyr Ser Val Gly Arg Pro
130 135 140

Arg Asp Ser Gln Pro Gly Glu Ile Ala Lys Phe Gly Leu Val Val Arg
145 150 155 160

Gly Thr Gly Pro Lys Asp Arg Met Val Ala Asn Tyr Val Arg Ser Glu
165 170 175

Leu Arg Gln Arg Gly Leu Arg Glu Val Arg Pro Val Gly Glu Asp Glu
180 185 190

Val Phe Leu Asp Ser Val Cys Leu Leu Asn Pro Asn Val Ser Ser Asp
195 200 205

Val Ile Asn Thr Asn Asp Val Glu Val Leu Asp Glu Cys Leu Ala Glu
210 215 220

Tyr Cys Thr Ser Leu Gln Thr Ser Pro Gly Val Leu Val Thr Gly Val
225 230 235 240

Arg Val Arg Ala Arg Asp Arg Val Ile Glu Leu Phe Glu His Pro Ala
245 250 255

Ile Val Asn Ile Ser Ser Arg Phe Ala Tyr Thr Pro Ser Pro Tyr Val
260 265 270

Phe Ala Gln Ala His Leu Pro Arg Leu Pro Ser Ser Leu Glu Pro Leu
275 280 285

Val Ser Gly Leu Phe Asp Gly Ile Pro Ala Pro Arg Gln Pro Leu Asp
290 295 300

Ala Arg Asp Arg Arg Thr Asp Val Val Ile Thr Gly Thr Arg Ala Pro
305 310 315 320

Arg Pro Met Ala Gly Thr Gly Ala Gly Gly Ala Gly Ala Lys Arg Ala
325 330 335

Thr Val Ser Glu Phe Val Gln Val Lys His Ile Asp Arg Val Val Ser
340 345 350

Pro Ser Val Ser Ser Ala Pro Pro Pro Ser Ala Pro Asp Ala Ser Leu
355 360 365

Pro Pro Pro Gly Leu Gln Glu Ala Ala Pro Pro Gly Pro Pro Leu Arg
370 375 380

Glu Leu Trp Trp Val Phe Tyr Ala Gly Asp Arg Ala Leu Glu Glu Pro
385 390 395 400

His Ala Glu Ser Gly Leu Thr Arg Glu Glu Val Arg Ala Val His Gly
405 410 415

Phe Arg Glu Gln Ala Trp Lys Leu Phe Gly Ser Val Gly Ala Pro Arg
420 425 430

Ala Phe Leu Gly Ala Ala Leu Ser Pro Thr Gln Lys Leu Ala Val Tyr
435 440 445

Tyr Tyr Leu Ile His Arg Glu Arg Arg Met Ser Pro Phe Pro Ala Leu
450 455 460

Val Arg Leu Val Gly Arg Tyr Ile Gln Arg His Gly Val Pro Ala Pro
465 470 475 480

Asp Glu Pro Thr Leu Ala Asp Ala Met Asn Gly Leu Phe Arg Asp Ala
485 490 495

Ala Gly Thr Val Ala Glu Gln Leu Leu Met Phe Asp Leu Leu Pro Pro
500 505 510

Lys Asp Val Pro Val Gly Ser Asp Ala Arg Ala Asp Ser Ala Ala Leu
515 520 525

Leu Arg Phe Val Asp Ser Gin Arg Leu Thr Pro Gly Gly Ser, Veff^^it 

530 535 540 

Pro Glu His Val Met Tyr Leu Gly Ala Phe Leu Gly Val Leu Tyr Ala
545 550 555 560

Gly His Gly Arg Leu Ala Ala Ala Thr His Thr Ala Arg Leu Thr Gly
565 570 575

Val Thr Ser Leu Val Leu Thr Val Gly Asp Val Asp Arg Met Ser Ala
580 585 590

Phe Asp Arg Gly Pro Ala Gly Ala Ala Gly Arg Thr Arg Thr Ala Gly
595 600 605

Tyr Leu Asp Ala Leu Leu Thr Val Cys Leu Ala Arg Ala Gln His Gly
610 615 620

Gln Ser Val
625

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 592 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:



Val Tyr Leu Ser Pro Ser Ala Leu Lys Trp Pro Val Gly Val Trp Thr
1 5 10 15

Thr Gly Gly Leu Ala Phe Gly Cys Asp Ala Ala Leu Val Arg Ala Arg
20 25 30

Tyr Gly Lys Gly Phe Met Gly Val Val Ile Ser Met Arg Asp Ser Pro
35 40 45

Pro Ala Glu Ile Ile Val Val Pro Ala Asp Lys Thr Leu Ala Arg Val
50 55 60

Gly Asn Pro Thr Asp Glu Asn Ala Pro Ala Val Leu Pro Gly Pro Pro
65 70 75 80

Ala Gly Pro Arg Tyr Arg Val Phe Val Leu Gly Ala Pro Thr Pro Ala
85 90 95

Asp Asn Gly Ser Ala Leu Asp Ala Leu Arg Arg Val Ala Gly Tyr Pro
100 105 110

Glu Glu Ser Thr Asn Tyr Ala Gln Tyr Met Ser Arg Ala Tyr Ala Glu
115 120 125

Phe Leu Gly Glu Asp Pro Gly Ser Gly Thr Asp Ala Arg Pro Ser Leu
130 135 140

Phe Trp Arg Leu Ala Gly Leu Leu Ala Ser Ser Gly Phe Ala Phe Val
145 150 155 160

Asn Ala Ala His Ala His Asp Ala Ile Arg Leu Ser Asp Leu Leu Gly
165 170 175

Phe Leu Ala His Ser Arg Val Leu Ala Gly Leu Ala Arg Ala Ala Gly
180 185 190

Cys Ala Ala Asp Ser Val Phe Leu Asn Val Ser Val Leu Asp Pro Ala
195 200 205

Ala Arg Leu Arg Leu Glu Ala Arg Leu Gly His Leu Val Ala Ala Ile
210 215 220

Arg Glu Gln Ser Leu Ala Ala His Ala Leu Gly Tyr Gln Leu Ala Phe
225 230 235 240

Val Leu Asp Ser Pro Ala Ala Tyr Gly Ala Val Ala Pro Ser Ala Ala
245 250 255

Arg Leu Ile Asp Ala Leu Tyr Ala Glu Phe Leu Gly Gly Arg Ala Leu
260 265 270

Thr Ala Pro Met Val Arg Arg Ala Leu Phe Tyr Ala Thr Ala Val Leu
275 280 285

Arg Ala Pro Phe Leu Ala Gly Ala Pro Ser Ala Glu Gln Arg Glu Arg
290 295 300

Ala Arg Arg Gly Leu Leu Ile Thr Thr Ala Leu Cys Thr Ser Asp Val
305 310 315 320

Ala Ala Ala Thr His Ala Asp Leu Arg Ala Ala Arg Thr Asp His Gln
325 330 335

Lys Asn Leu Phe Trp Leu Pro Asp His Phe Ser Pro Cys Ala Ala Ser
340 345 350

Leu Arg Phe Asp Leu Ala Glu Gly Gly Phe Ile Leu Asp Ala Met Ala
355 360 365

Thr Arg Ser Asp Ile Pro Ala Asp Val Met Ala Gln Gln Thr Arg Gly
370 375 380

Val Ala Ser Val Leu Thr Arg Trp Ala His Tyr Asn Ala Leu Ile Arg
385 390 395 400

Ala Phe Val Pro Glu Ala Thr His Gln Cys Ser Gly Pro Ser His Asn
405 410 415

Ala Glu Pro Arg Ile Leu Val Pro Ile Thr His Asn Ala Ser Tyr Val
420 425 430

Val Thr His Thr Pro Leu Pro Arg Gly Ile Gly Tyr Lys Leu Thr Gly
435 440 445

Val Asp Val Arg Arg Pro Leu Phe Ile Thr Tyr Leu Thr Ala Thr Cys
450 455 460

Glu Gly His Ala Arg Glu Ile Glu Pro Lys Arg Leu Val Arg Thr Glu
465 470 475 480

Asn Arg Arg Asp Leu Gly Leu Val Gly Ala Val Phe Leu Arg Tyr Thr
485 490 495

Pro Ala Gly Glu Val Met Ser Val Leu Leu Val Asp Thr Asp Ala Thr
500 505 510

Gln Gln Gln Leu Ala Gln Gly Pro Val Ala Gly Thr Pro Asn Val Phe
515 520 525

Ser Ser Asp Val Pro Ser Val Leu Leu Phe Pro Asn Gly Thr Val Ile
530 535 540

His Leu Leu Ala Phe Asp Thr Leu Pro Ile Ala Thr Ile Ala Pro Gly
545 550 555 560

Phe Leu Ala Ala Ser Ala Leu Gly Val Val Met Ile Thr Ala Ala Gly
565 570 575

Ile Leu Arg Val Val Arg Thr Cys Val Pro Phe Leu Trp Arg Arg Glu
580 585 590

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 315 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Val Ser Ile Ser Ala Gly Val Arg Gly Gln Gly Trp His Arg Ile Ser
1 5 10 15

Thr Pro Pro Lys Asn Gly Ala Gly Arg Ser Val Leu Val Phe Gly Leu
20 25 30

Val Leu Pro Leu Cys Phe Tyr Pro His Pro Thr Pro Ser Phe Gly Pro
35 40 45

Arg Leu Arg Gln Gln Arg Ala Ser Asp Ser Leu Arg Gly Ala Glu Pro
50 55 60

Leu Trp Ala Val Gly Thr Asp Thr Pro Pro Ser Ala Asp Trp Gln Pro
65 70 75 80

Gly Arg Thr Thr Met Gly Pro Gly Leu Trp Val Val Met Gly Val Leu
85 90 95

Val Gly Val Ala Gly Gly His Asp Thr Tyr Trp Thr Glu Gln Ile Asp
100 105 110

Pro Trp Phe Leu His Gly Leu Gly Leu Ala Arg Thr Tyr Trp Arg Asp
115 120 125

Thr Asn Thr Gly Arg Leu Trp Leu Pro Asn Thr Pro Asp Ala Ser Asp
130 135 140

Pro Gln Arg Gly Arg Leu Ala Pro Pro Gly Glu Leu Asn Leu Thr Thr
145 150 155 160

Ala Ser Val Pro Met Leu Arg Trp Tyr Ala Glu Arg Phe Cys Phe Val
165 170 175

Leu Val Thr Thr Ala Glu Phe Pro Arg Asp Pro Gly Gln Leu Leu Tyr
180 185 190

Ile Pro Lys Thr Tyr Leu Leu Gly Arg Pro Arg Asn Ala Ser Leu Pro
195 200 205

Glu Leu Pro Glu Ala Gly Pro Thr Ser Arg Pro Pro Ala Glu Val Thr
210 215 220

Gln Leu Lys Gly Leu Ser His Asn Pro Gly Ala Ser Ala Leu Leu Arg
225 230 235 240

Ser Arg Ala Trp Val Thr Phe Ala Ala Ala Pro Asp Arg Glu Gly Leu
245 250 255

Thr Phe Pro Arg Gly Asp Asp Gly Ala Thr Glu Arg His Pro Asp Gly
260 265 270

Arg Arg Asn Ala Pro Pro Pro Gly Pro Pro Ala Gly Thr Pro Arg His
275 280 285

Pro Thr Thr Asn Leu Ser Ile Ala His Leu His Asn Ala Ser Val Thr
290 295 300

Trp Leu Ala Arg Leu Leu Arg Thr Pro Gly Arg
305 310 315

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 370 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Met Ala Ser His Ala Gly Gln Gln His Ala Pro Ala Phe Gly Gln Ala
1 5 10 15

Ala Arg Ala Ser Gly Pro Thr Asp Gly Arg Ala Ala Ser Arg Pro Ser
20 25 30

His Arg Gln Gly Ala Ser Asp Pro Glu Leu Pro Thr Leu Leu Arg Val
35 40 45

Tyr Ile Asp Gly Pro His Gly Val Gly Lys Thr Thr Thr Ser Ala Gln
50 55 60

Leu Met Glu Ala Leu Gly Pro Arg Asp Asn Ile Val Tyr Val Pro Glu
65 70 75 80

Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr Leu Thr Asn
85 90 95

Ile Tyr Asn Thr Gln His Arg Leu Asp Arg Gly Glu Ile Ser Ala Gly
100 105 110

Glu Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met Ser Thr Pro
115 120 125

Tyr Ala Ala Thr Asp Ala Val Leu Ala Pro His Ile Gly Gly Glu Ala
130 135 140

Val Gly Pro Gln Ala Pro Pro Pro Ala Leu Thr Leu Val Phe Asp Arg
145 150 155 160

His Pro Ile Ala Ser Leu Leu Cys Tyr Pro Ala Ala Arg Tyr Leu Met
165 170 175

Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Met Pro Pro Thr
180 185 190

Ala Pro Gly Thr Asn Leu Val Leu Gly Val Leu Pro Glu Ala Glu His
195 200 205

Ala Asp Arg Leu Ala Arg Arg Gln Arg Pro Gly Glu Arg Leu Asp Leu
210 215 220

Ala Met Leu Ser Ala Ile Arg Arg Val Tyr Asp Leu Leu Ala Asn Thr
225 230 235 240

Val Arg Tyr Leu Gln Arg Gly Gly Arg Trp Arg Glu Asp Trp Gly Arg
245 250 255

Leu Thr Gly Val Ala Ala Ala Thr Pro Arg Pro Asp Pro Glu Asp Gly
260 265 270

Ala Gly Ser Leu Pro Arg Ile Glu Asp Thr Leu Phe Ala Leu Phe Arg
275 280 285

Val Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu Tyr His Ile Phe Ala
290 295 300

Trp Val Leu Asp Val Leu Ala Asp Arg Leu Leu Pro Met His Leu Phe
305 310 315 320

Val Leu Asp Tyr Asp Gln Ser Pro Val Gly Cys Arg Asp Ala Leu Leu
325 330 335

Arg Leu Thr Ala Gly Met Ile Pro Thr Arg Val Thr Thr Ala Gly Ser
340 345 350

Ile Ala Glu Ile Arg Asp Leu Ala Arg Thr Phe Ala Arg Glu Val Gly
355 360 365

Gly Val
370

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Val Leu Arg Val Val Asp Val Arg Gln Gly Leu Gly Gly Pro Gln His
1 5 10 15

Leu Pro Val Ser His Arg Leu Gly Asp Val Asp Asp Ile Val Ala Arg
20 25 30

Pro Gln Gly Leu His Gln Leu Arg Gly Gly Gly Gly Leu Pro His Pro
35 40 45

Val Gly Ser Val Tyr Ile Asn Pro Gln Gln Arg Gly Gln Leu Arg Ile
50 55 60

Pro Ala Gly Phe Gly Gly Pro Leu Ala Met Ala Arg Thr Gly Arg Arg
65 70 75 80

Ala Ala Val Gly Arg Pro Ala Arg Thr Ser Ser Leu Thr Glu Arg Arg
85 90 95

Arg Val Leu Leu Ala Gly Val Arg Ser His Thr Arg Phe Tyr Lys Ala
100 105 110

Phe Ala Arg Glu Val Arg Glu Phe Asn Ala Thr Arg Ile Cys Gly Thr
115 120 125

Leu Leu Thr Leu Met Ser Gly Ser Leu Gln Gly Arg Ser Leu Phe Glu
130 135 140

Ala Thr Arg Val Thr Leu Ile Cys Glu Val Asp Leu Gly Pro Arg Arg
145 150 155 160

Pro Asp Cys Ile Cys Val Phe Glu Phe Ala Asn Asp Lys Thr Leu Gly
165 170 175

Gly Val Cys Val Ile Leu Lys Thr Cys Lys Ser Ile Ser Ser Gly Asp
180 185 190

Thr Ala Ser Lys Arg Glu Gln Arg Thr Thr Gly Met Lys Gln Leu Arg
195 200 205

His Ser Leu Lys Leu Leu Gln Ser Leu Ala Pro Pro Gly Asp Lys Val
210 215 220

Val Tyr Leu Cys Pro Ile Leu Val Phe Val Ala Gln Arg Thr Leu Arg
225 230 235 240

Val Ser Arg Val Thr Arg Leu Val Pro Gln Lys Ile Ser Gly Asn Ile
245 250 255

Thr Ala Ala Val Arg Met Leu Gln Ser Leu Ser Thr Tyr Ala Val Pro
260 265 270

Pro Glu Pro Gln Thr Arg Arg Ser Arg Arg Arg Val Ala Ala Thr Ala
275 280 285

Arg Pro Gln Arg Pro Pro Ser Pro Thr Arg Asp Pro Glu Gly Thr Ala
290 295 300

Gly His Pro Ala Pro Pro Glu Ser Asp Pro Pro Ser Pro Gly Val Val
305 310 315 320

Gly Val Ala Ala Glu Gly Gly Gly Val Leu Gln Lys Ile Ala Ala Leu
325 330 335

Phe Cys Val Pro Val Ala Ala Lys Ser Arg Pro Arg Thr Lys Thr Glu
340 345 350

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 514 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Met Asp Pro Tyr Tyr Pro Phe Asp Ala Leu Asp Val Trp Glu His Arg
1 5 10 15

Arg Phe Ile Val Ala Asp Ser Arg Ser Phe Ile Thr Pro Glu Phe Pro
20 25 30

Arg Asp Phe Trp Met Leu Pro Val Phe Asn Ile Pro Arg Glu Thr Ala
35 40 45

Ala Glu Arg Ala Ala Val Met Gln Ala Gln Arg Thr Ala Ala Ala Ala
50 55 60

Ala Leu Glu Asn Ala Ala Leu Gln Ala Ala Glu Leu Pro Val Asp Ile
65 70 75 80

Glu Arg Arg Ile Arg Pro Ile Glu Gln Gln Val His His Ile Ala Asp
85 90 95

Ala Leu Glu Ala Leu Glu Thr Ala Ala Ala Ala Ala Glu Glu Ala Asp
100 105 110

Ala Ala Arg Asp Ala Glu Arg Glu Gly Ala Ala Asp Gly Ala Ala Pro
115 120 125

Ser Pro Thr Ala Gly Pro Ala Ala Ala Glu Met Glu Val Gln Ile Val
130 135 140

Arg Asn Asp Pro Pro Leu Arg Tyr Asp Thr Asn Leu Pro Val Asp Leu
145 150 155 160

Leu His Met Val Tyr Ala Gly Arg Gly Ala Ala Gly Ser Ser Gly Val
165 170 175

Val Phe Gly Thr Trp Tyr Arg Thr Ile Gln Glu Arg Thr Ile Ala Asp
180 185 190

Phe Pro Leu Thr Thr Arg Ser Ala Asp Phe Arg Asp Gly Arg Met Ser
195 200 205

Lys Thr Phe Met Thr Ala Leu Val Leu Ser Leu Gln Ser Cys Gly Arg
210 215 220

Leu Tyr Val Gly Gln Arg His Tyr Ser Ala Phe Glu Cys Ala Val Leu
225 230 235 240

Cys Leu Tyr Leu Leu Tyr Arg Thr Thr His Glu Ser Ser Pro Asp Arg
245 250 255

Asp Arg Ala Pro Val Ala Phe Gly Asp Leu Leu Ala Arg Leu Pro Arg
260 265 270

Tyr Leu Ala Arg Leu Ala Ala Val Ile Gly Asp Glu Ser Gly Arg Pro
275 280 285

Gln Tyr Arg Tyr Arg Asp Asp Lys Leu Pro Lys Ala Gln Phe Ala Ala
290 295 300

Ala Gly Gly Arg Tyr Glu His Gly Ala Thr His Val Val Ile Ala Thr
305 310 315 320

Leu Val Arg His Gly Val Leu Pro Ala Ala Pro Gly Asp Val Pro Arg
325 330 335

Asp Thr Ser Thr Arg Val Asn Pro Asp Asp Val Ala His Arg Asp Asp
340 345 350

Val Asn Arg Ala Ala Ala Ala Phe Leu Arg His Asn Leu Phe Leu Trp
355 360 365

Glu Asp Gln Thr Leu Leu Arg Ala Thr Ala Asn Thr Ile Thr Ala Val
370 375 380

Leu Arg Arg Leu Leu Ala Asn Gly Asn Val Tyr Ala Asp Arg Leu Asp
385 390 395 400

Asn Arg Leu Gln Leu Gly Met Leu Ile Pro Gly Ala Val Pro Ala Glu
405 410 415

Ala Ile Arg Ala Ser Gly Leu Asp Ser Gly Ala Ile Lys Ser Gly Asp
420 425 430

Asn Asn Leu Glu Ala Leu Cys Val Asn Tyr Val Leu Pro Leu Tyr Gln
435 440 445

Ala Asp Pro Thr Val Glu Leu Thr Gln Leu Phe Pro Gly Ala Gly Arg
450 455 460

Pro Val Pro Gly Arg Pro Gly Gly Ala Ala Thr Gly Val Asp Glu Arg
465 470 475 480

Gly Tyr Val Val Gly Arg Pro Pro Gly Gly Ala Arg Ala Pro His Arg
485 490 495

Ala Gly Ala His Gln Pro His Pro His Lys His His Pro Cys Gly Gly
500 505 510

Asp Tyr

(2) INFORMATION FOR SEQ ID NO: 52:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 91 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Val Val Asp Met Leu Ser Gly Ala Arg Gln Ala Ala Leu Val Arg Leu
1 5 10 15

Thr Ala Leu Glu Leu Ile Asn Arg Thr Arg Thr Asn Thr Thr Pro Val
20 25 30

Gly Glu Ile Ile Asn Ala His Asp Ala Leu Gly Ile Gln Tyr Glu Gln
35 40 45

Gly Leu Gly Leu Leu Ala Gln Gln Ala Arg Ile Gln Ala Lys Arg Phe
50 55 60

Ala Thr Phe Asn Val Gly Ser Asp Tyr Asp Leu Leu Tyr Phe Leu Cys
65 70 75 80

Leu Gly Phe Ile Pro Gln Tyr Leu Ser Val Ala
85 90

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 444 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Val Arg Val Pro Met Ala Ser Ala Glu Met Arg Glu Arg Leu Glu Ala
1 5 10 15

Pro Leu Pro Asp Arg Ala Val Pro Ile Tyr Val Ala Gly Phe Leu Ala
20 25 30

Leu Tyr Asp Ser Gly Asp Pro Gly Glu Leu Ala Leu Asp Pro Asp Thr
35 40 45

Val Arg Ala Ala Leu Pro Pro Glu Asn Pro Leu Pro Ile Asn Val Asp
50 55 60

His Arg Ala Arg Cys Glu Val Gly Arg Val Leu Ala Val Val Asn Asp
65 70 75 80

Pro Arg Gly Pro Phe Phe Val Gly Leu Ile Ala Cys Val Gln Leu Glu
85 90 95

Arg Val Leu Glu Thr Ala Ala Ser Ala Ala Ile Phe Glu Arg Arg Gly
100 105 110

Pro Ala Leu Ser Arg Glu Glu Arg Leu Leu Tyr Leu Ile Thr Asn Tyr
115 120 125

Leu Pro Ser Val Ser Leu Ser Thr Lys Arg Arg Gly Asp Glu Val Pro
130 135 140

Pro Asp Arg Thr Leu Phe Ala His Val Cys Ala Ile Gly Arg Arg Leu
145 150 155 160

Gly Thr Ile Val Thr Tyr Asp Thr Ser Leu Asp Ala Ala Ile Ala Pro
165 170 175

Phe Arg His Leu Asp Pro Ala Thr Arg Glu Gly Val Arg Arg Glu Ala
180 185 190

Ala Glu Ala Glu Leu Ala Gly Arg Thr Trp Ala Pro Gly Val Glu Ala
195 200 205

Leu Thr His Thr Leu Leu Ser Thr Ala Val Asn Asn Met Met Leu Arg
210 215 220

Asp Arg Trp Ser Leu Val Ala Glu Arg Arg Arg Gln Ala Gly Ile Ala
225 230 235 240

Gly His Thr Tyr Leu Gln Ala Ser Glu Lys Phe Lys Ile Trp Gly Ala
245 250 255

Glu Ser Ala Pro Ala Pro Glu Arg Gly Tyr Lys Thr Gly Ala Pro Gly
260 265 270

Ala Met Asp Thr Ser Pro Ala Ala Ser Val Pro Ala Pro Gln Val Ala
275 280 285

Val Arg Ala Arg Gln Val Ala Ser Ser Ser Ser Ser Ser Ser Ser Phe
290 295 300

Pro Ala Pro Ala Asp Met Asn Pro Val Ser Ala Ser Gly Ala Pro Ala
305 310 315 320

Pro Pro Pro Pro Gly Asp Gly Ser Tyr Leu Trp Ile Pro Ala Phe His
325 330 335

Tyr Asn Gln Leu Val Thr Gly Gln Ser Ala Pro His His Pro Pro Leu
340 345 350

Thr Ala Cys Gly Leu Pro Ala Ala Gly Thr Val Ala Tyr Gly His Pro
355 360 365

Gly Ala Gly Pro Ser Pro His Tyr Pro Pro Pro Pro Ala His Pro Tyr
370 375 380

Pro Gly Tyr Ala Val Arg Gly Pro Gln Ser Pro Gly Gly Pro Asp Arg
385 390 395 400

Arg Ala Gly Gly Gly His Arg Arg Arg Pro Pro Gly Gly Trp Ala Ser
405 410 415

Gly Gly Arg Arg Arg Pro Arg Asp Pro Gly Val Gly Glu Pro Pro Pro
420 425 430

Thr Arg Gly Gly Ala Ala Gly Val Arg Leu Arg Pro
435 440

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 250 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Met Leu Phe Ala Gly Pro Ser Pro Leu Glu Ala Gln Ile Ala Ala Leu
1 5 10 15

Val Gly Ala Ile Ala Ala Asp Arg Gln Ala Gly Gly Leu Pro Ala Ala
20 25 30

Ala Gly Asp His Gly Ile Arg Gly Ser Ala Asn Arg Arg Arg His Glu
35 40 45

Val Glu Gln Pro Glu Tyr Asp Cys Gly Arg Asp Glu Pro Asp Arg Asp
50 55 60

Phe Pro Tyr Tyr Pro Gly Glu Ala Arg Pro Glu Pro Arg Pro Val Asp
65 70 75 80

Ser Arg Arg Ala Ala Arg Gln Ala Ser Gly Phe Thr Ile Thr Ala Leu
85 90 95

Val Gly Ala Val Thr Ser Leu Gln Gln Glu Leu Ala His Met Arg Ala
100 105 110

Arg Thr His Ala Pro Tyr Gly Pro Tyr Pro Pro Val Gly Pro Tyr His
115 120 125

His Pro His Ala Asp Thr Glu Thr Pro Ala Gln Pro Pro Arg Tyr Pro
130 135 140

Ala Glu Ala Val Tyr Leu Pro Pro Pro His Ile Ala Pro Pro Gly Pro
145 150 155 160

Pro Leu Ser Gly Ala Val Pro Pro Pro Ser Tyr Pro Pro Val Ala Val
165 170 175

Thr Pro Gly Pro Ala Pro Pro Leu His Gln Pro Ser Pro Ala His Ala
180 185 190

His Pro Pro Pro Pro Pro Pro Gly Pro Thr Pro Pro Pro Ala Ala Ser
195 200 205

Leu Pro Gln Pro Glu Ala Pro Gly Ala Glu Ala Gly Ala Leu Val Asn
210 215 220

Ala Ser Ser Ala Ala His Val Asn Val Asp Thr Ala Arg Ala Ala Asp
225 230 235 240

Leu Phe Val Ser Gln Met Met Gly Ser Arg
245 250

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1161 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:



CCTGGCACCT TGTAGCAGTC CACCTATAGG ATACCCCAGC ACTTTGGGAG GCCGAGGCAG 60 

ACGGATCACA AGTTCAGGAG ATCGAAACCA TCCTGGCCAA CATGGTAAAA CCCCGTCTCT 120 

ACTAAAAACG TAAAAATTAN CTGAGTGTGG TGGTGTGTGC CTGTACTCCC AGCTACTTGG 180

GAGGCTGAGG CAAAAAATTC ACTTGAACCT GGGAGGTGGA GGTTGCACTG AGCTGAGGTC 240 

ATGCCACTGC ACTCCAGCCT AGCAACAGAG TGAGACTCCA TCTCAAAAAA ATAAATAAAT 300 

AAATAAATAA ATAAATAAAG ACATATGGAG GCCTTACTCT GTGCCAAGCA CTATGATGGG 360 

CACAGGGAAC AAACACACGG GCTCCCTGAG CACCAGCGGT GAGCCAGGCA CCGTGCCTGG 420 

AGACCAACGT CTGGCGTTTT GTATGCGGAC ATGATACCCG GCACTCTCCC CTATGGCTAA 480

TGAATCATCG AGCTTCACCA GAGAAACGCG AACAGACCCC CTTGTCCATG AATTTGCTAA 540 

TTGACCTCCC CCAACTCAGA CATCAACCCT GCATTACCAT AAATTACTGG CTAAGAAACA 600 

CCCGTTCTCA ACCTGCTGGC CTCAATGGGT TACACGTCCC ACAAACCCCT TTCCCAAGGT 660

GAAATACCAA CCTCGAAAAC TCGGAAAATT CAAGTTCAAC AACCTCCGGG ATTGCGGGCT 720 

TTACCAGCGA AGCCCTCTCC AAAAATTTGC TCGCTTAGAC ATCCAACCGC TTCTCCACTA 780

AAGACCCCGG CTTCTTCTCA CCTCGGCGTT CTCTTGCAAA AAATACGCGT CTGTTAATCC 840 

GCGCCCCTCT TCCTCACACA CTTCTCCCCT GCCTACTCAT ACCTCATCTC TCCTATAACC 900 

CTCTCGCGAA AGAGCCCCTG TCTCTCACCT GTTTAATACC ACCCATGCGG GCGGGTTGTC 960 

CCTTAAATGG ATATTCTGAA CCTCAACCCT CCCCCCATTC TAATTTGGCG TGTTGGCCCC 1020 

CTCACCTGCC CCCTCCTCTC CCAAAGTCCG GGAATATCCG TTCCCTGGCC ACCCTCCTTC 1080

TTTATGAAAC CCGGCTTACC CCCCCGGGGA AATAGGCCGT TTGGCTTTTG TGGCGCGCCC 1140 

TTCCACCTTC CCCTCTACAA 1161 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Val Lys Tyr Gln Pro Arg Lys Leu Gly Lys Phe Lys Phe Asn Asn Leu
1 5 10 15

Arg Asp Cys Gly Gln Arg Ser Pro Leu Gln Lys Phe Ala Arg Leu Asp
20 25 30

Ile Gln Pro Leu Leu His
35

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 524 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

GCAGGTCGAT CTAGAAGTCC CCAGGGGTCA GGGGTCTCAC TTGAGAAGGT AGTCTGTCCG 60

TTCTCCAATC TCAACCTCCG TGTTGGGAGA TCCACTGCTC ACTTCAAAGC TGTGAGACAG 120

AGTTGTTTGC GTCTGCAGAG GTTTCAGCTG CTTTTTGTTG TTGTAGTCGT CGTCGTTGTT 180

GTTGTTGTTG TTGTTTAGCT GTGCCCTGTC CCCAGAGGTG GAGTCTACAG AGACAGGCAG 240

GACTCCTTGA GCTGCTGTGA GCTCCACCCA GTTGGAGCTT CCCAGCTGCT TTGTTTATCT 300

ACTTAAGCCT CAGTAATGGC GGGCGCCCCT CCCCCAACCT CGCTGCTGCC TTGCCCCCAG 360

ATCGCAGACT GCTGTGCTAA CAACGAGGGA GGCCCTGTGG GCATGGGACC CTCCTGGCCA 420

NGTGTGGGAT ATANTCTCCT GGTTGTGCCC GTTGGTAAAA TTTCTGGGTA AACCCCATAT 480

TGGGGGTTTG AATTCCCCAA ATTTCCCAGT TTGTTTTGTG TCT 524

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 76 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Val Glu Leu Thr Ala Ala Gln Gly Val Leu Pro Val Ser Val Asp Ser
1 5 10 15

Thr Ser Gly Asp Arg Ala Gln Leu Asn Asn Asn Asn Asn Asn Asn Asp
20 25 30

Asp Asp Tyr Asn Asn Lys Lys Gln Leu Lys Pro Leu Gln Thr Gln Thr
35 40 45

Thr Leu Ser His Ser Phe Glu Val Ser Ser Gly Ser Pro Asn Thr Glu
50 55 60

Val Glu Ile Gly Glu Arg Thr Asp Tyr Leu Leu Lys
65 70 75



(2) INFORMATION FOR SEQ ID NO: 59: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 773 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:



GGAAGAGGGC GCTGTTGCCC GCGCTCCTTG CGCGGTGGCG GCGGGGGGCA GGCGGAGGCA 60 

GGCGCGGCGT GCGGGGCCTC CGGCGCCTTC CCCCCGCCCT CGCTCGGGGG GCTGTTCGCC 120

CACTCTGCGT CGTCGTTGCC GGCGTAATCC GCGTCGTCGC TGTCGTCCGC CTGGGGCACC 180 

AGCAGCCAGC GCCGCAGGAG CGATGACGCG GCCGGCGCGC TCTCGACCGC GGTTCCCGAG 240

TCGTACGCAG GGACCATTTG GGAGTCTGCG GTTGGGAACG CGCCGGGGCG CGGCACGGTT 300 

GGACCGCCGG GGCGCGGCCG GCGCCGGGGA CCCCGGCGGC GGGGACTCCG GCGGGACATG 360 

GAGGGCGGCT GGGCTCGGCC TATGCCCGGA TCCGGATCGC GTCTGGGCGG GAGATTTCAC 420 

TCGGCACGCA TGCACGTCTC CCCCCCCCCC CGTGGTTGCC TATGAAACTA CCCCGTCCCG 480 

CTGGTGTGCG CATTTCTGTC CGCGTTGCCG GCCTTCTTTG CGGCGCGTGG CTTGACTGGG 540

ATCCCCTCCC CTCTCCCTTC CCCTCCGGGA TTCACCCCCG GGGGGGGTTT TTCTGGGGGG 600 

GGGGTTAATA GCTGTCTGTC CCCTCCCCAC CGTTTCCTCC CTGGACTCCA CGGCGCTCCA 660 

TAACTCTCTC CTGGTCCACC CCCCATTCCC CACATGGCCT TTGCTTTTCA ACCCCCCCCT 720 

CCGGTTGGGC TGCATATCAA TTTCCTTCTC CCCCGGGGGA TCCCCTATTA CG 773 



(2) INFORMATION FOR SEQ ID NO: 60: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 121 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Met Ser Arg Arg Ser Pro Arg Arg Arg Gly Pro Arg Arg Arg Pro Arg
1 5 10 15

Pro Gly Gly Pro Thr Val Pro Arg Pro Gly Ala Phe Pro Thr Ala Asp
20 25 30

Ser Gln Met Val Pro Ala Tyr Asp Ser Gly Thr Ala Val Glu Ser Ala
35 40 45

Pro Ala Ala Ser Ser Leu Leu Arg Arg Trp Leu Leu Val Pro Gln Ala
50 55 60

Asp Asp Ser Asp Asp Ala Asp Tyr Ala Gly Asn Asp Asp Ala Glu Trp
65 70 75 80

Ala Asn Ser Pro Pro Ser Glu Gly Gly Gly Lys Ala Pro Glu Ala Pro
85 90 95

His Ala Ala Pro Ala Ser Ala Cys Pro Pro Pro Pro Pro Arg Lys Glu
100 105 110

Arg Gly Gln Gln Arg Pro Leu Pro Xaa
115 120

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 981 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

CCCGGCGGAG AATGAGCCCA CACCCAGATT GTTCACCGCC CCCCATTTCC CCCCCCCCGG 60

GTATACACCN AGGGAAAAGG TTTTTCCCCC CCCCCCCGGA TCAAATTTCC CCCACAAGAA 120

CCGAGTTCCA GGTTAAGTTT AGTTGGGTGC CCTTCCCAGG TTGACGGGGG TCGCCAATGT 180

CCCAGCGGGG GTTGGCGCCC TCAGGGGGGG NGGGGCCAGC CCCCGCGGGC GGTCGCCCAC 240

CAACTTCCAA GCCGCGGCCC GCCGAGGCCA GCACGGTCCC CGGGGGGCCG GTGGCAGACG 300

CCCAGCGTAT CTGCGGGGGC GGGCCCGCGT CCGCGTCGTC GCGCAGCACC AGCGGGGGCG 360

CGTCGCCGTC GGGCTAGAGC AGCGCCCGCG CGCAGAACTC CCGCCGCGGC CCGAGCAGAT 420

CAGCCGGGCC GCCGCGCACG GTGTCGCGCC CCAGCGCCAC GTAGACGGGC CGCAGCGGCG 480 

CGCCCAGGCC CCAGCGCGCG CAGGCGCGGT GCGAGTGCGC CTCGTCCTCG CAGAAGTCCG 540 

GCGCGCCGGG CGCCATGGCG TCGCCCGCGC CCGAGGCGGC GGCCCGGCCG TCCAGCGCCG 600 

GGAGCACGGC GCGGCGGTAC TCGCGCGGGG ACATGGGCAC CAGCGTGTCG GGGCCGAAGC 660

GCGTGCGCAC GCGGTACCGC ACGTTGGCCC CGCGGCAGAG GCGCAGCGGC GGCGCGTCGG 720 

GGTACATGCG CGCGTGCGCG GTCTCCACGC GCGCGAATAC CCCGGCCCTA ACACTCTGCC 780 

GGATGCCATC ACGGTGCTGC GCTTGTTCCG CGCCCCCGGT CTTCGCACGG CGCTCTGTCT 840 

TGGCGGGCTC CTCCTCCCTA GGTTATTTTT GGGTTCTTTC CTCTAAAAAA CCCGGGGCCT 900 

CTTTTGGGGG GGCCTTTTCC TCCCGGTCCC CTCCCCGGTT TGTGAACCAA CTAAATATAG 960

GCCGGTGGTT CCCCCAGGCC 981 

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 122 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Val Glu Thr Ala His Ala Arg Met Tyr Pro Asp Ala Pro Pro Leu Arg
1 5 10 15

Leu Cys Arg Gly Ala Asn Val Arg Tyr Arg Val Arg Thr Arg Phe Gly
20 25 30

Pro Asp Thr Leu Val Pro Met Ser Pro Arg Glu Tyr Arg Arg Ala Val
35 40 45

Leu Pro Ala Leu Asp Gly Arg Ala Ala Ala Ser Gly Ala Gly Asp Ala
50 55 60

Met Ala Pro Gly Ala Pro Asp Phe Cys Glu Asp Glu Ala His Ser His
65 70 75 80

Arg Ala Cys Ala Arg Trp Gly Leu Gly Ala Pro Leu Arg Pro Val Tyr
85 90 95

Val Gly Arg Asp Thr Val Arg Gly Gly Pro Ala Asp Leu Leu Gly Pro
100 105 110

Arg Arg Glu Phe Cys Ala Arg Ala Leu Leu
115 120

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 644 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

GCATGCCTGC AGGTCGACTC TAAAGGATCC CCCAGCTGCC TCTCCCNTGG AAGACATATA 60

TCTTCTTGTG CCGGACCAGC TTCAGTGGGG CCAGTGGGCC CTCTAGGGCA ACGGTCACTT 120 

GTATCGTGGC ATCCGTGTGC ACTGGCCCCA TTTCCATTGA AAAAATCAGG GTGTCACGGG 180 

TTCTGAGGCT GCCATTGAGG CGGTAAAAAA TGGCCCCGTC TAGCAGGTCC TCCTGGGAGA 240 

AGGCTGTGAC TGACTGAGTG GCCCAAAGCA ACTGCCCCCA GCGACGGCCG GCTGTGACAT 300 

GGTAGTGCAC CTCATCCCCA CTGCGGATGT CAAGGTTGGT GTCCANGTGG AGCTCANCTG 360

TGTCAATGGT GCCCTGACCT CCTTGAGGGA CCACAAGGCC AAAACCGTTG GCCACACGGA 420 

TGTAGGGCTC CGAGGCCTGC ACCTCCANCA NCACAGTGGC CTGGTGTTGC CCATCGGACA 480 

CACCTGCAGC TGGATCCAGC CATGATCAGC CCAAGTGCAT GAACAGGACT CGCCTCTTTC 540 

TGAGGTCCTC CTGGGTGAAG CGGTAAATGG GCCACGTGGG CTCATCCACG GCCACGATAC 600 

TGCCAAAAAA GAAGTCCTTG CGGGTCAGGG TACCGANCTC AAA 644

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 151 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Val Ser Asp Gly Gln His Gln Ala Thr Val Xaa Xaa Glu Val Gln Ala
1 5 10 15

Ser Glu Pro Tyr Ile Arg Val Ala Asn Gly Phe Gly Leu Val Val Pro
20 25 30

Gln Gly Gly Gln Gly Thr Ile Asp Thr Xaa Glu Leu His Xaa Asp Thr
35 40 45

Asn Leu Asp Ile Arg Ser Gly Asp Glu Val His Tyr His Val Thr Ala
50 55 60

Gly Arg Arg Trp Gly Gln Leu Leu Trp Ala Thr Gln Ser Val Thr Ala
65 70 75 80

Phe Ser Gln Glu Asp Leu Leu Asp Gly Ala Ile Phe Tyr Arg Leu Asn
85 90 95

Gly Ser Leu Arg Thr Arg Asp Thr Leu Ile Phe Ser Met Glu Met Gly
100 105 110

Pro Val His Thr Asp Ala Thr Ile Gln Val Thr Val Glu Gly Pro Leu
115 120 125

Ala Pro Leu Lys Leu Val Arg His Lys Lys Ile Tyr Val Phe Xaa Gly
130 135 140

Arg Gly Ser Trp Gly Ile Leu
145 150

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 52 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Val Glu Leu Xaa Cys Val Asn Gly Ala Leu Thr Ser Leu Arg Asp His
1 5 10 15

Lys Ala Lys Thr Val Gly His Thr Asp Val Gly Leu Arg Gly Leu His
20 25 30

Leu Xaa Xaa His Ser Gly Leu Val Leu Pro Ile Gly His Thr Cys Ser
35 40 45

Trp Ile Gln Pro
50

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 585 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

GCTCAATCCT CGAATTCAGA AAACAGTTGC CATTTATCCC TTCTTGTCAG ACTTCAGACG 60

GGTGATTGAG ATTGGTAATA CTAGCGAGGC TTACGACGAA CTTTTCCGTT ATTTCAAGTT 120 

TCACGACCCC TTCCATGAAA CAGAGGAGGA AATCATGGCG ACCCTTGCCT ATATCGATGT 180 

CAAAAATCTT GCCCATCGTA TCCAAGGTGA GGTTAAAATG ATTACGGGCT TGGACAACAA 240 

TGTTTGCTAT CCCATTACCC AGTTTGCGAT TTATAATCGT CTGACCTGCG ATAAAACCTA 300

TCGCATCATG CCTGAGTATG CTCACGAAGC CATGAATGTA TTTGTCAATG ACCAAGTCTA 360 

CAACTGGCTC TGTGGAAGTG AGATTCCTTT TAAATATCTA AAATAAGGAG TCGACTCTAA 420 

GCACAAAATC TTAAAAATTA CAAACACGCA TAGTATCAGG GGATTAAAAA AACTTGATAC 480 

TATGCGTTTT ATCATGGACA TATATTATAA TGAAACAAGA ACAGGACAAA TCGATCCGGA 540 

CAGTCCAATC GATTTCTAAC AATGTTTTAA AAGTAAATGT GTCT 585

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 



(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Met Ile Thr Gly Leu Asp Asn Asn Val Cys Tyr Pro Ile Thr Gln Phe
1 5 10 15

Ala Ile Tyr Asn Arg Leu Thr Cys Asp Lys Tyr Ile Met Pro Glu Tyr
20 25 30

Ala His Glu Ala Met Asn Val Phe Val Asn Asp Gln Val Tyr Asn Trp
35 40 45

Leu Cys Gly Ser Glu Ile Pro Phe Lys Tyr Leu Lys
50 55 60

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1237 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

ACTGTATCGA TAAGCTTGAT CCATGGCGGT GGCCGACACA GGGAGGGGCG TCTTCTCCGG 60

AGCGGACGTA GGCGAATCGG AGGACGCGTC CTCGGTCAAG GCCATGAGGC GCCGCCCGGT 120 

TAGGGGGGCC CGAACGTCGG GGTCAACCCC CTCGGGGTCT GTCCGCAGGG CGCGTCAAAA 180 

CCGCGGGCGG GGTGGGAAGG GGGCGTACGG ACCGTCATCT AGGGCCCCGG GGGCCCAATG 240 

GGGTGGCAGG ACCCCGACGT CTTCCGTGGG TCGTGCCATC CGAATAAACG TGCGGCCCGT 300

AATCCCCACC AGCAGGCTCT GGGGAGCAAA ACCGACGCGT GGTAGGTCGC TGGGGGCGGC 360

GGGCGTCTGT GGGGGCAAAC AGCGCTCCCG GAAACGCAGG CCACAAAACC CGGGGTTGGG 420

GGCGGAATAC CATACCGGGG GCACCTATCG CCACGGGCGG CCCGCGGGGA CCGGGGGGAC 480

TCACGGGCCG CCCTCCGCAC GCGCCTCCTG TGGGGGGGCG GTGGGGTTTT CTGCCTATTC 540

CCTTCCTTTC CTCATCCTCT TCCTTCCCTA CTTCCCCCCT TCTCATTTCT CCTCTCTTCT 600

GTTCACCCCT TACTCCGCTT CACTTGCTCT CTCTCTATTC ATTCCGTCCT CTACTTTTCT 660 

CGTCCTTTCC TCTCTCTCCC CTCATCTATC TTCTCTTCAT CTCTTTTTCT CTCGCTCCCT 720 

CTCTTTCCCA TCTTCCGTTC TTCTCTGACT TTCATATCAT TCTTGCCTCA CCCCGACACT 780 

CGCTATTCTC TCTTTCTCGC CACCAATCCT GTGTGTGTTT CTCGTTTTCT TCACACCTCG 840 

TTCTATAGCT CACCACATCA TGTGCTTTCT CGTATCTCCT ATCCTCCTTA TCCTTCTCTT 900

TTCTTTCTCT CTCAACCGCT CCCTTCTGTT CCACAGACAC TCTCTCTGCT CTCTCTCATT 960 

CTTGCGCTCT TGTATTCACT TCTCATCATT CTTACACTTT TCTCTCTCAT TCGCACCCAT 1020 

CTACCGCTAC GTCATTCACA CCGCGATTTT TTCTAGCTCT ACCTATTCCT CCTCGACTTC 1080 

TCTGTGCGAC TATACTCCCC TCTTCTTTCT GTCCTACACG TCTGAGATCA CATTGATCTT 1140 

CCCTCACCCC TCTGCTCCTG ACTATACCTT ATTCTATTTA TTTCTCGACC CTTCCTTCCC 1200

ATTCTCTATT CTCGACTTCT CTGCACCTCT CCTCAC 1237 

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Met Ala Leu Thr Glu Asp Ala Ser Ser Asp Ser Pro Thr Ser Ala Pro
1 5 10 15

Glu Lys Thr Pro Leu Pro Val Ser Ala Thr Ala Met Asp Gln Ala Tyr
20 25 30

Arg Tyr Ser Xaa Xaa
35

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2057 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

GGCGACACGA GAAGAGAGCA AGGAGGGGAG AGCGACAGAG TAACTACACA TCGGAGCGCA 60

TAATGAAGGA CAGTCATAAG ATGAGAGCAA GAAGAGATGA GCAGAGTCAG AAGATATAGC 120 

GACATAGAAG TCAGAGGCGA CACGGGGTAC GTGAGAATGT CGAACGCAGG ACGGGCGGAG 180 

GAGAGGAAGG GGGCCCGAGC GCAGAACGAA ACAGGGGAAC ACAGGAGAAC GGAAGGAGCG 240 

AGCAAAGGGG AACAGACAGA AGGGCGCGCA NAGCGGAACG CCGCGAGAGC CGAGAGCCAC 300 

AACACGGCCA ACGCGGCGAG AGCGACAGGG CAGACAGGGC AACGCGCCAG ACGGAAGCGG 360

CGAGGGACGA AGCGNAGAGC NGGAGAGGCA AGGACAAAGC AAAGGAGGAG GAGAGCGCAC 420 

AGAAACGAAG AAGACGAGCG CGACAAGCCA CGGGAACGGA ACGAAACCGA GGCCACGGAG 480 

GGAGCAAAGC GAAGGGAGGC CTCCTTCCTC ATCATATTCG GCATCATCAT CGTCTTCATC 540 

CGCATATTCC TCTGCGGGCG GGGCTGGTGG GAGCGTCGCG TCCGCGTCCG GCGCTGGGGA 600 

GAGACGAGAA ACCTCCCTCG GCCCCCGCGC TGCTGCGCCG CGGGGGCCGA GGAAGTGTGC 660

CAGGAAGACG CGCCACGCGG AGGGCGGCCC CGAGCCCGGG GCCCGCGACC CGGCGCCCGG 720 

CCTCACGCGC TACCTGCCCA TCGCGGGGGT CTCGAGCGTC GTGGCCCTGG CGCCTTACGT 780 

GAACAAGACG GTCACGGGGG ACTGCCTGCC CGTCCTGGAC ATGGAGACGG GCCACATAGG 840 

GGCCTACGTG GTCCTCGTGG ACCAGACGGG GAACGTGGCG GACCTGCTGC GGGCCGCGGC 900 

CCCCGCGTGG AGCCGCCGCA CCCTGCTCCC CGAGCACGCG CGCAACTGCG TGAGGCCCCC 960

CGACTACCCG ACGCCCCCCG CGTCGGAGTG GAACAGCCTC TGGATGACCC CGGTGGGCAA 1020 

CATGCTCTTT GACCAGGGCA CCCTGGTGGG CGCGCTGGAC TTCCACGGCC TCCGGTCGCG 1080 

CCACCCGTGG TCTCGGGAGC AGGGCGCGCC CGCGCCGGCC GGCGACGCCC CCGCGGGCCA 1140 

CGGGGAGTAG GGGGAGCTAA CACTCGGCTT GCTGCCCGAA GGGAAGCCGC CCCCCACCGG 1200 

ACCACCGGCC GAGGCGCCTC GGGGGCATGG GGATGTGGGG GGGGGGGGAA AACNGGGATC 1260

ATATCCGGAT TGCGGGTGGG ATTGGGGGGG GTATGTTTTT TGTTTNTTTT TGTTTTTTTT 1320 

TTTTTNTTTT GGTGTTGGTT TTTTTGGTTT TTGTTTTTTT TTNGGGGGAT TTTTGTTTTT 1380 

TTTTTTTTTT TTNTTTTTTC GTTTTTTTTT TTGTGTTTTT NTTTGGTNTT TGGTTTGTTT 1440 

TGTGTTTTTT TTTTTTTTNT TNTTTTTTTT TTTTGGGNTT TNTGTTTTTT TTGTTTGTTT 1500 

CTTTGTTTTT TTTTNTTTTG TTTCGTGTTT TTCTTTTTTT TTTCCTTCCT TTTCCCCCCG 1560

CTTTCCCCCC CCTNCTCCCC CTCCTCTTCT CTTTCTCTNN TTTTCCTCTT CCCTTTTTCT 1620 

TCCCGTCTCC CCTCTGCGTT TCCCTCTCCC TTTTCTTCCC TTCCCGCTTC TCCGTCCCTC 1680 

CTCTTTTCCC TCCTTCCTCT TTCTTCCCCT GCTGCCTCCC TCTCTCCTCC TGTCCTTTTC 1740 

CCTCTTTTTC CCCTCCCTCT GCCCTTTCTT CCCTTCTCCT CTTCCCTCCC CTCCTTTCTT 1800 

TCCTTCCTCG CGTCGTTCCT CCCTCTTCTC TCTCTCTCTC CTCTNGTCCC CCCCCTCTTT 1860

CTCTTCCCCC CTCTTTTTCT CTCCTCTCGT CCTCCTTCCC CCTCATTTTA GCCTCATCCC 1920 

CTCCATCCTA TTACTCCTCT ATTCTCCTCT CTCTCCCTCT TCCATCCCTT CCGCTCCTCC 1980 

CATTATTCCT CTAAGCTTGC CCTCCTCCAC CTTCTCTCTA TCTCAAGTCC TCCTCCCTCT 2040

CACTATTCGG TTCCCT 2057

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Val Ala Pro Tyr Val Asn Lys Thr Val Thr Gly Asp Cys Leu Pro Val
1 5 10 15

Leu Asp Met Gly His Ile Gly Ala Tyr Val Val Leu Val Asp Gln Thr
20 25 30

Gly Asn Val Ala Asp Leu Leu Arg Ala Ala Ala Pro Ala Trp Ser Arg
35 40 45

Arg Thr Leu Leu Pro Glu His Ala Arg Asn Cys Val Arg Pro Pro Asp
50 55 60

Tyr Pro Thr Pro Pro Ala Ser Glu Trp Asn Ser Leu Trp Met Thr Pro
65 70 75 80

Val Gly Asn Met Leu Phe Asp Gln Gly Thr Leu Val Gly Ala Leu Asp
85 90 95

Phe His Gly Leu Arg Ser Arg His Pro Trp Ser Arg Glu Gln Gly Ala
100 105 110

Pro Ala Pro Ala Gly Asp Ala Pro Ala Gly His Gly Glu
115 120 125

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1468 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

GCACCGCCCG GAAAGGGATC CCGGGGGGAA CCCCGCCCCC GAGAGGCGAC CGGGGCAGAA 60

CCCCCGGCAC GGTGAGAGGN GACCCCCGGT TATCAGGCCC CCCTTTTTCC CCGACCACCC 120

AGGAGGGGGG TTGGGGGTGT TGCGGGGCGT GGGGTTTGGG GGCGGGGACG CTTGACGGGG 180

CAGACCCCCG CCCCGCTTAA GCGGTCGGGG GACCCCCATG GGCCGTGCGC CGCCCCCCGA 240

CCCTTTGGGG GGGGCGAGGG AGGCAGGGAG CCCTGAGCCC GAGAGCGGGG GACAGGGGGG 300

GGGAGACGAG GGGTAGGAAT CCAAAGGACG CAGACCACCT TTGGTTACGG ACCCCTTTCT 360

CCCCCCCTTC CGAACAAAAA GCAGCGGGCG GGGGGCCGGG GTGAGGGAGG GACACGGGGG 420

ACACGGCACG GGGGTCCCGC CTCACGCCCC GCGCCCTCTA AATCCCCCCC CGTTTCTTTG 480

TCAAGCAGCC CGCCGCCCCG CACGCCTGGG GGATGCTCAA CGACATGCAG TGGCTCGCCA 540

GCAGCGACTC GGAGGAGGAG ACCGAGGTGG GAATCTCTGA CGACGACCTT CACCGCGACT 600

CCACCTCCGA GGCGGGCAGC ACGGACACGG AGATGTTCGA GGCGGGCCTG ATGGACGCGG 660

CCACGCCCCC GGCCCGGCCC CCGGCCGAGC GCCAGGGCAG CCCCACGCCC GCCGACGCGC 720

AGGGATCCTG CGGGGGTGGG CCCGTGGGTG AGGAGGAAGC GGAAGCGGGA GGGGGGGGCG 780

ACGTGTGTGC CGTGTGCACG GACGAGATCG CCCCGCCCCT GCGCTGCCAG AGTTTTCCCT 840

GCCTGCACCC CTTCTGCATC CCGTGCATGA AGACCTGGAT TCCGTTGCGC AACACGTGTC 900

CCCTGTGCAA CACCCCGGTG GCGTACCTGA TAGTGGGCGT GACCGCCAGC GGGTCGTTCA 960

GCACCATCCC GATAGTGAAC GACCCCCGGA CCCGCGTGGA GGCCGAGGCG GCCGTGCGGT 1020

CCGGCACGGC CGTGGACTTT ATCTGGACGG GCAACCCGCG GACGGCCCCG CGCTCCCTGT 1080

CGCTGGGGGG ACACACGGTC CGCGCCCTGT CGCCCACCCC CCCGTGGCCC GGCACGGACG 1140

ACGAGGACGA TGACCCGCCC GACGGTGAGG GCGGGCGGGG GTCTGGCACT GGGCGGGGGT 1200

CCGGCACTGG GCGGGGGTCT GGCACTGGGC GGGGGTCCGG CACTGGGCGG GGGTCTGGCG 1260

GGGGTCAGGC ACTAACCGGG GGTTCCCGTC TCTGTCTCCC TCTGCAACCG GAACTAATTT 1320

CCCGCCCCCC CCCTAATACC TCCCCGCCCG GGGCTGCTGT GCCGGGGCCA CCCCTGGTAA 1380

CTCCACCCCC CCTTTTACCT AACCTGCGCC CCCCGGCCCC CCCCGGGACT ACACTCACCC 1440

GTGGCCCCCC CTTCCTGGGC CGGGGTT 1468

(2) INFORMATION FOR SEQ ID NO: 73:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 319 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: 



Met Leu Asn Asp Met Gln Trp Leu Ala Ser Ser Asp Ser Glu Glu Glu
1 5 10 15

Thr Glu Val Gly Ile Ser Asp Asp Asp Leu His Arg Asp Ser Thr Ser
20 25 30

Glu Ala Gly Ser Thr Asp Thr Glu Met Phe Glu Ala Gly Leu Met Asp
35 40 45

Ala Ala Thr Pro Pro Ala Arg Pro Pro Ala Glu Arg Gln Gly Ser Pro
50 55 60

Thr Pro Ala Asp Ala Gln Gly Ser Cys Gly Gly Gly Pro Val Gly Glu
65 70 75 80

Glu Glu Ala Glu Ala Gly Gly Gly Gly Asp Val Cys Ala Val Cys Thr
85 90 95

Asp Glu Ile Ala Pro Pro Leu Arg Cys Gln Ser Phe Pro Cys Leu His
100 105 110

Pro Phe Cys Ile Pro Cys Met Lys Thr Trp Ile Pro Leu Arg Asn Thr
115 120 125

Cys Pro Leu Cys Asn Thr Pro Val Ala Tyr Leu Ile Val Gly Val Thr
130 135 140

Ala Ser Gly Ser Phe Ser Thr Ile Pro Ile Val Asn Asp Pro Arg Thr
145 150 155 160

Arg Val Glu Ala Glu Ala Ala Val Arg Ser Gly Thr Ala Val Asp Phe
165 170 175

Ile Trp Thr Gly Asn Pro Arg Thr Ala Pro Arg Ser Leu Ser Leu Gly
180 185 190

Gly His Thr Val Arg Ala Leu Ser Pro Thr Pro Pro Trp Pro Gly Thr
195 200 205

Asp Asp Glu Asp Asp Asp Pro Pro Asp Gly Glu Gly Gly Arg Gly Ser
210 215 220

Gly Thr Gly Arg Gly Ser Gly Thr Gly Arg Gly Ser Gly Thr Gly Arg
225 230 235 240

Gly Ser Gly Thr Gly Arg Gly Ser Gly Gly Gly Gln Ala Leu Thr Gly
245 250 255

Gly Ser Arg Leu Cys Leu Pro Leu Gln Pro Glu Leu Ile Ser Arg Pro
260 265 270

Pro Pro Asn Thr Ser Pro Pro Gly Ala Ala Val Pro Gly Pro Pro Leu
275 280 285

Val Thr Pro Pro Pro Leu Leu Pro Asn Leu Arg Pro Pro Ala Pro Pro
290 295 300

Gly Thr Thr Leu Thr Arg Gly Pro Pro Phe Leu Gly Arg Gly Phe
305 310 315



(2) INFORMATION FOR SEQ ID NO: 74: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 620 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

AAAACGCACG AGTATTGCAC GAATAACCAA CCAAACAACC ACTCAGACCA TGTGGATCCA 60

TACCCTTACT TGGCAAAATG GGGCATTAGC CGTGAGCAGT TTAAGCATGA TATTGAGAAC 120

GGCTTGACGA TTGAAACAGG CTGGCAGAAG AATGACACTG GCTACTGGTA CGTACACTCA 180

GACGGCTCTT ATCCAAAAGA CAAGTTTGAG AAAATCAATG GCACTTGGTA CTACTTTGAC 240

AGTTCAGGCT ATATGCTTGC AGACCGCTGG AGGAAGCACA CAGACGGCAA CTGGTACTGG 300

TTCGACAACT CAGGCGAAAT GGCTACAGGC TGGAAGAAAA TCGCTGATAA GTGGTACTAT 360

TTCAACGAAG AAGGTGCCAT GAAGACAGGC TGGGTCAAGT ACAAGGACAC TTGGTACTAC 420

TTAAACGCTA AAGAAGGCGC CATGGTATCA AATGCCTTTA TCCACTCAGC CGGACGGAAC 480

AGGCTGGTAC TACCTCAAAC CAGACCGAAC ACTGGCAGAC AAGCCAGAAT TCACAGTAGA 540

CCCAGATGGC TTGATTACGT TAAAATAATA ATGGAATGTC TTTCAAATCA AAACCCCGCA 600

TATTATTAGG TCTTGAAAA 620

(2) INFORMATION FOR SEQ ID NO: 75: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 



Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
1 5 10 15

Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
20 25 30

Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
35 40 45

Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asn Ala Lys Glu Gly Ala Met
50 55 60

Val Ser Asn Ala Phe Ile His Ser Ala Gly Arg Asn Arg Leu Val Leu
65 70 75 80

Pro Gln Trp Asn Thr Gly Arg Gln Ala Arg Ile His Ser Arg Pro Arg
85 90 95

Trp Leu Asp Tyr Val Lys Ile Ile Met Glu Cys Leu Ser Asn Gln Asn
100 105 110

Pro Ala Tyr Tyr
115

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2695 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

GGGAGAGAAG AGAGAGAGAG AGAGAGAAGG AGTAGGAGAG CGAGGAGAGG AGAATAAGGA 60

GTGAATGGAA GCAGTAAGCT AGATAGGCAG AGAGAGAGAG AACGGAGAGT AGGAGTGGAA 120

GAAGTGGAAG TTGAGAACGA CAAGGAGAGA GAAAGGAAGA AAAGTAGAGA GACTAGAGAA 180

TAGAAGAGGA GAACAGAGAG GTAGGAGAAA GAGGAAAAGA AGAGAGAGAG AAGGCAGCGA 240

GAAAGAGAGG AGCAGGCGGA CAAGGAGAAG AGGGAGGAAG AGGAAAAGAG GAAGAGAGAA 300

GAAGAATGGT GGAGAGAGAA GAGGAAAGAG CACCCGCGCC ACCGAGGATT GGGAGATGAA 360

TTAGGGGCCC CTAAGAGGAC CGAAGACCCG GGCGTAGATT ATTCGCCCCG GAGGGCAAGG 420

GAGGTCGACC GCAAAGTAAA TACACCACCA GGGAGGAGGG AAATATGAAC GCCGGCGGAG 480

ACCCGGGGCC CTAGATACTG GTGAAGACGA TCAAAAGTAT GGATATGCCG GTCGCCACCA 540

GCTTTTTGGC CCCGGACGGA ACGCCGCTGC AGTACGCGCT ATGCTTCCCG GCCGTCACCG 600

ACAAACTCGG CGCGCTGCTG ATGCGTCCCG AGGCGGCCTG CGTGCGGCCC CCGCTTCCGA 660

CGGACGTCCT CGAATCGGCC CCGACGGTCA CGGCCATGTA CGTGCTGACC GTCGTGAACC 720

GGCTCCAGCT GGCCCTCAGC GACGCCCAGG CCGCCAACTT TCAGCTGTTC GGTCGCTTCG 780

TGCGCCATCG CCAGGCGACG TGGGGCGCCT CGATGGACGC GGCGGCCGAG CTGTACGTCG 840

CCCTCGTCGC CACCACCCTC ACGCGCGAGT TTGGGTGTCG CTGGGCCCAG CTGGGCTGGG 900

CGTCCGGAGC GGCGGCGCCG CGTCCGCCGC CGGGCCCCCG GGGGTCCCAG CGCCACTGCG 960

TCGCCTTCAA CGAGAACGAC GTGCTGGTCG CGCTGGTGGC CGGCGTTCCG GAACACATCT 1020

ACAACTTCTG GCGCCTGGAC CTCGTTCGCC AGCACGAGTA CATGCACCTC ACCCTCGAAC 1080

GCGCGTTCGA GGACGCAGCG GAGTCCATGC TGTTCGTCCA GCGCCTGACC CCGCATCCCG 1140

ACGCCCGCAT CCGCGTGTTG CCGACGTTTT TGGACGGAGG CCCCCCGACC CGGGGCCTCC 1200

TGTTCGGCAC GCGGCTGGCC GACTGGCGCC GGGGCAAGCT GTCCGAAACC GACCCGCTGG 1260

CGCCCTGGCG CTCGGCCTTG GAGCTCGGGA CCCAGCGCCG GGACGCCCCG GCGCTCGGGA 1320

AGCTCAGTCC GGCCCAGGCC CTGGCGGCGG TGAGCGTCCT CGGGCGCATG TGTCTGCCGA 1380

GCGCCGCTTT GGCCGCGCTG TGGACCTGCA TGTTTCCCGA CGACTACACC GAGTACGACA 1440

GCTTCGACGC CCTCCTGGCC GCACGCCTGG AGTCTGGCCA GACGCTCGGC CCGGCGGGGG 1500

GGCGCGAGGC GTCCCTCCCC GAGGCCCCCC ACGCCCTCTA CCGACCCACG GGCCAGCACG 1560

TGGCCGTGCT GGCCGCCGCG ACCCACCGCA CCCCCGCCGC GCGCGTTACG GCCATGGACC 1620

TGGTTCTGGC CGCGGTGCTC CTCGGCGCGC CCGTCGTGGT GGCGCTCCGC AACACCACGG 1680

CCTTCTCCCG CGAGTCGGAA CTGGAACTGT GCCTGACGCT CTTCGACTCG CGCCCCGGCG 1740

GGCCGGACGC CGCCCTGCGC GACGTCGTGT CGTCCGACAT CGAGACGTGG GCCGTCGGCC 1800

TCCTCCACAC CGATCTCAAC CCGATCGAAA ACGCGTGTCT GGCGGCGCAG CTCCCGCGCC 1860

TGTCGGCGCT CATCGCCGAG CGCCCTCTCG CCGACGGGCC CCCGTGCCTG GTCCTCGTGG 1920

ACATCTCCAT GACCCCGGTC GCGGTCTTGT GGGAAGCCCC GGAGCCCCCC GGCCCCCCTG 1980

ACGTGCGGTT TGTGGGCAGC GAGGCCACCG AGGAGCTTCC GTTTGTGGCT ACCGCGGGGG 2040

ACGTTCTTGC GGCGAGCGCC GCCGACGCGG ACCCCTTCTT CGCGCGGGCC ATCCTCGGGC 2100

GGCCCTTCGA CGCCTCCCTC CTGACGGGGG AGCTGTTCCC GGGACACCCG GTTTACCAGC 2160

GCCCCCTCGC CGACGAGGCA GGTCCCTCTG CCCCGACCGC CGCCCGCGAC CCGCGGGACC 2220

TTGCGGGGGG GGATGGCGGA TCGGGTCCCG AGGACCCCGC TGCCCCCCCC GCGCGGCAGG 2280

CGGACCCGGG GGTCCTCGCC CCCACTTTCC TCACCGACGC CACCACCGGC GAGCCCGTCC 2340

CCCCTCGCAT GTGGGCCTGG ATCCACGGCC TGGAGGAGCT GGCGTCCGAG GACGCCGGCG 2400

GCCCCACGCC CAATCCGGCC CCGGCCTTAC TTCCCCCCCC CGCCACCGAT CAGTCCGTCC 2460

CCACGTCCCA GTACGCACCG CGGCCCATCG GGCCGGCAGN TACGGCTCGC GAAACACGAC 2520

CGAGTGTCCC GCCTCAACAA AACACGGGGC GCGTGCCCGT GGCCCCTCGG GANGACCCAC 2580

GGCCCTCGCC ACCCACACCG AGTCCCCCCG CGGATGCCGC GGTTCCTCCC CCGGCCTTTT 2640

CCGGGTTTGC CGCCGCTTTT TCCGCCGCCG TGCCGCGCGT GCGCAGATCC CGCC

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 718 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Val Lys Thr Ile Lys Ser Met Asp Met Pro Val Ala Thr Ser Phe Leu
1 5 10 15

Ala Pro Asp Gly Thr Pro Leu Gln Tyr Ala Leu Cys Phe Pro Ala Val
20 25 30

Thr Asp Lys Leu Gly Ala Leu Leu Met Arg Pro Glu Ala Ala Cys Val
35 40 45

Arg Pro Pro Leu Pro Thr Asp Val Leu Glu Ser Ala Pro Thr Val Thr
50 55 60

Ala Met Tyr Val Leu Thr Val Val Asn Arg Leu Gln Leu Ala Leu Ser
65 70 75 80

Asp Ala Gln Ala Ala Asn Phe Gln Leu Phe Gly Arg Phe Val Arg His
85 90 95

Arg Gln Ala Thr Trp Gly Ala Ser Met Asp Ala Ala Ala Glu Leu Tyr
100 105 110

Val Val Ala Thr Thr Leu Thr Arg Glu Phe Gly Cys Arg Trp Ala Gln
115 120 125

Leu Gly Trp Ala Ser Gly Ala Ala Ala Pro Arg Pro Pro Pro Gly Pro
130 135 140

Arg Gly Ser Gln Arg His Cys Val Ala Phe Asn Glu Asn Asp Val Leu
145 150 155 160

Val Val Ala Gly Val Pro Glu His Ile Tyr Asn Phe Trp Arg Leu Asp
165 170 175

Leu Val Arg Gln His Glu Tyr Met His Leu Thr Leu Glu Arg Ala Phe
180 185 190

Glu Asp Ala Ala Glu Ser Met Leu Phe Val Gln Arg Leu Thr Pro His
195 200 205

Pro Asp Ala Arg Ile Arg Val Leu Pro Thr Phe Leu Asp Gly Gly Pro
210 215 220

Pro Thr Arg Gly Leu Leu Phe Gly Thr Arg Leu Ala Asp Trp Arg Arg
225 230 235 240

Gly Lys Leu Ser Glu Thr Asp Pro Leu Ala Pro Trp Arg Ser Ala Leu
245 250 255

Glu Leu Gly Thr Gln Arg Arg Asp Ala Pro Ala Leu Gly Lys Leu Ser
260 265 270

Pro Ala Gln Ala Ala Val Ser Val Leu Gly Arg Met Cys Leu Pro Ser
275 280 285

Ala Ala Ala Leu Trp Thr Cys Met Phe Pro Asp Asp Tyr Thr Glu Tyr
290 295 300

Asp Ser Phe Asp Ala Leu Leu Ala Ala Arg Leu Glu Ser Gly Gln Thr
305 310 315 320

Leu Gly Pro Ala Gly Gly Arg Glu Ala Ser Leu Pro Glu Ala Pro His
325 330 335

Ala Leu Tyr Arg Pro Thr Gly Gln His Val Ala Val Leu Ala Ala Ala
340 345 350

Thr Thr Pro Ala Ala Arg Val Thr Ala Met Asp Leu Val Leu Ala Ala
355 360 365

Val Leu Leu Gly Ala Pro Val Val Val Arg Asn Thr Thr Ala Phe Ser
370 375 380

Arg Glu Ser Glu Leu Glu Leu Cys Leu Thr Leu Phe Asp Ser Arg Pro
385 390 395 400

Gly Gly Pro Asp Ala Ala Leu Arg Asp Val Val Ser Ser Asp Ile Glu
405 410 415

Thr Trp Ala Val Gly Leu Leu His Thr Asp Leu Asn Pro Ile Glu Asn
420 425 430

Ala Cys Leu Ala Ala Gln Leu Pro Arg Leu Ser Ala Leu Ile Ala Glu
435 440 445

Arg Pro Leu Ala Asp Gly Pro Pro Cys Leu Val Leu Val Asp Ile Ser
450 455 460

Met Thr Pro Val Ala Val Leu Trp Glu Ala Pro Glu Pro Pro Gly Pro
465 470 475 480

Pro Asp Val Arg Phe Val Gly Ser Glu Ala Thr Glu Glu Leu Pro Phe
485 490 495

Val Ala Thr Ala Gly Asp Val Leu Ala Ala Ser Ala Ala Asp Ala Asp
500 505 510

Pro Phe Phe Ala Arg Ala Ile Leu Gly Arg Pro Phe Asp Ala Ser Leu
515 520 525

Leu Thr Gly Glu Leu Phe Pro Gly His Pro Val Tyr Gln Arg Pro Leu
530 535 540

Ala Asp Glu Ala Gly Pro Ser Ala Pro Thr Ala Ala Arg Asp Pro Arg
545 550 555 560

Asp Leu Ala Gly Gly Asp Gly Gly Ser Gly Pro Glu Asp Pro Ala Ala
565 570 575

Pro Pro Ala Arg Gln Ala Asp Pro Gly Val Leu Ala Pro Thr Phe Leu
580 585 590

Thr Asp Ala Thr Thr Gly Glu Pro Val Pro Pro Arg Met Trp Ala Trp
595 600 605

Ile His Gly Leu Glu Glu Leu Ala Ser Glu Asp Ala Gly Gly Pro Thr
610 615 620

Pro Asn Pro Ala Pro Ala Leu Leu Pro Pro Pro Ala Thr Asp Gln Ser
625 630 635 640

Val Pro Thr Ser Gln Tyr Ala Pro Arg Pro Ile Gly Pro Ala Xaa Thr
645 650 655

Ala Arg Glu Trp Ser Val Pro Pro Gln Gln Asn Thr Gly Arg Val Pro
660 665 670

Val Ala Pro Arg Xaa Asp Pro Arg Pro Ser Pro Pro Thr Pro Ser Pro
675 680 685

Pro Ala Asp Ala Ala Val Pro Pro Pro Ala Phe Ser Gly Phe Ala Ala
690 695 700

Ala Phe Ser Ala Ala Val Pro Arg Val Arg Arg Ser Arg Arg
705 710 715

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

GGGGGGAGGG AAACAAGCCC GATAGAGCGC TAATGAGAAG GGAGGATATA ATGAGGGACA 60

TTGGGGGAGG GAAGGTAACA GGAAGGTTTT AGGAGCCCAA GAAGACCGGA GGACCCCCAA 120 

GACATTGGAG GAAATGGCCG AGGCCGTTAA GGAGGGAAGA GGTTCAGACA TCCCGGCCCC 180 

CACCCGTACA AGAGCCCAGG AGCCACCAAG CCCCGGGAGC GGGAAAACAA AAAGCCGCCC 240 

AAGCGGCCCG AGGCGACCCC GCCCCCCGAC GCCAACGCGA CCGTCGCCGC CGGCCACGCC 300

ACGCTGCGCG CGCACCTGCG GGAAATCAAG GTCGAGAACG CCGATGCCCA GTTTTACGTG 360 

TGCCCGCCCC CGACGGGCGC CACGGTGGTG CAGTTTGAGC AGCCGCGCCG CTGCCCGACG 420 

CGCCCGGAGG GGCAGAACTA CACGGAGGGC ATCGCGGTGG TCTTCAAGGA GAACATCGCC 480 

CCGTACAAAT TCAAGGCCAC CATGTACTAC AAAGACGTGA CCGTGTCGCA GGTGTGGTTC 540 

GGCCACCGCT ACTCCCAGTT TATGGGGATA TTCGAGGACC GCGCCCCCGT TCCCTTCGAG 600

GAGGTGATCG ACAAGATTAA CGCCAAGGGG GTCTGCCGCT CCACGGCCAA GTACGTGCGG 660 

AACAACATGG AGACCACCGC GTTTCACCGG GACGACCACG AGACCGACAT GGAGCTCAAG 720 

CCGGCGAAGG TCGCCACGCG CACGAGCCGG GGGTGGCACA CCACCGACCT CAAGTACAAC 780 

CCCTCGCGGG TGGAGGCGTT CCATCGGTAC GGCACGACGG TCAACTGCAT CGTCGAGGAG 840 

GTGGACGCGC GGTCGGTGTA CCCGTACGAT GAGTTTGTGT TGGCGACGGG CGACTTTGTG 900

TACATGTCCC CGTTTTACGG CTACCGGGAG GGGTCGCACA CCGAGCACAC CAGCTACGCC 960 

GCCGACCGCT TCAAGCAGGT CGACGGCTTC TACGCGCGCG ACCTCACCAC GAAGGCCCGG 1020 

GCCACGTCGC CGACGACCCG CAACTTGCTG ACGACCCCCA AGTTTACCGT GGCCTGGGAC 1080 

TGGGTGCCGA AGCGACCGGC GGTCTGCACC ATGACCAAGT GGCAGGAGGT GGACGAGATG 1140 

CTCCGCGCCG AGTACGGCGG CTCCTTCCGC TTCTCCTCCG ACGCCATCTC GACCACCTTC 1200

ACCACCAACC TGACCCAGTA CTCGCTCTCG CGCGTCGACC TGGGCGACTG CATTGGCCGG 1260 

GATGCCCGCG AGGCCATCGA CCGCATGTTT GCGCGCAAGT ACAACGCCAC GCACATCAAG 1320 

GTGGGCCAGC CGCAGTACTA CCTGGCCACG GGGGGCTTCC TCATCGCGTA CCAGCCCCTC 1380 

CTCAGCAACA CGCTCGCCGA GCTGTACGTG CGGGAGTACA TGCGGGAGCA GGACCGCAAG 1440 

CCCCGGAATG CCACGCCCGC GCCACTGCGG GAGGCGCCCA GCGCCAACGC GTCCGTGGAG 1500

CGCATCAAGA CCACCTCCTC GATCGAGTTC GCCCGGCTGC AGTTTACGTA TAACCACATA 1560 

CAGCGCCACG TGAACGACAT GCTGGGGCGC ATCGCCGTCG CGTGGTGCGA GCTGCAGAAC 1620

CACGAGCTGA CTCTCTGGAA CGAGGCCCGC AAGCTCAACC CCAACGCCAT CGCCTCCGCC 1680 

ACCGTCGGCC GGCGGGTGAG CGCGCGCATG CTCGGAGACG TCATGGCCGT CTCCACGTGC 1740 

GTGCCCGTCG CCCCGGACAA CGTGATCGTG CAGAACTCGA TGCGCGTCAG CTCGCGGCCG 1800

GGGACGTGCT ACAGCCGCCC CCTGGTCAGC TTTCGGTACG AAGACCAGGG CCCGCTGATC 1860 

GAGGGGCAGC TGGGCGAGAA CAACGAGCTG CGCCTCACCC GCGACGCGCT CGAGCCGTGC 1920 

ACCGTGGGCC ACCGGCGCTA CTTCATCTTC GGCGGGGGCT ACGTGTACTT CGAGGAGTAC 1980 

GCGTACTCTC ACCAGCTGAG TCGCGCCGAC GTCACCACCG TCAGCACCTT CATCGACCTG 2040 

AACATCACCA TGCTGGAGGA CCACGAGTTT GTGCCCCTGG AGGTCTACAC GCGCCACGAG 2100

ATCAAGGACA GCGGCCTGCT GGACTACACG GAGGTCCAGC GCCGCAACCA GCTGCACGAC 2160 

CTGCGCTTTG CCGACATCGA CACGGTCATC CGCGCCGACG CCAACGCCGC CATGTTCGCG 2220 

GGGCTGTGCG CGTTCTTCGA GGGGATGGGG GACTTGGGGC GCGCGGTCGG CAAGGTAGTC 2280 

ATGGGAGTAG TGGGGGGCGT GGTGTCGGCC GTCTCGGGCG TGTCCTCCTT TATGTCCAAC 2340 

CCCTTCGGGG CGCTTGCCGT GGGGCTGCTG GTCCTGGCCG GCCTGGTCGC GGCCTTCTTC 2400

GCCTTCCGCT ACGTCCTGCA ACTGCAACGC AATCCCATGA AGGCCCTGTA TCCGCTCACC 2460 

ACCAAGGAAC TCAAGACTTC CGACCCCGGG GGCGTGGGCG GGGAGGGGGA GGAAGGCGCG 2520 

GAGGGGGGCG GGTTTGACGA GGCCAAGTTG GCCGAGGCCC GAGAAATGAT CCGATATATG 2580 
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GNTTTGGTGT CGGCCATGGA GCGCACGGAA CACAAGGCCA GAAAGAAGGG CACGAGCGCC 2640 

CTGCTCAGCT CCAAGGTCAC CAACATGGTT CTGCGCAAGC GCAACAAAGC CAGGTACTCT 2700 

CCGCTCCACA ACGAGGACGA GGCCGGAGAC GAAGACGAGC TCTAAGGGAG GGGAGGGGAG 2760 

CTGGGCTTGT GTATAAATAA AAAGACACCG ATGTTCAAAA ATACACATGA CTTCTNGGTA 2820 

TGTNTGGGTA CCGAGCTCGA A 2842

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 787 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Val Cys Pro Pro Pro Thr Gly Ala Thr Val Val Gln Phe Glu Gln Pro
1 5 10 15

Arg Arg Cys Pro Trp Glu Gly Gln Asn Tyr Thr Glu Gly Ile Ala Val
20 25 30

Val Phe Lys Glu Asn Ile Ala Pro Tyr Lys Phe Lys Ala Thr Met Tyr
35 40 45

Tyr Lys Asp Val Thr Val Ser Gln Val Trp Phe Gly His Arg Tyr Ser
50 55 60

Gln Phe Met Gly Ile Phe Glu Asp Arg Ala Pro Val Pro Phe Glu Glu
65 70 75 80

Val Ile Asp Lys Ile Asn Ala Lys Gly Val Cys Arg Ser Thr Ala Lys
85 90 95

Tyr Val Arg Asn Asn Met Thr Ala Phe His Arg Asp Asp His Glu Thr
100 105 110

Asp Met Glu Leu Lys Pro Ala Lys Val Ala Thr Arg Thr Ser Arg Gly
115 120 125

Trp His Thr Thr Asp Leu Lys Tyr Asn Pro Ser Arg Val Glu Ala Phe
130 135 140

His Arg Tyr Gly Thr Thr Val Asn Cys Ile Val Glu Glu Val Asp Ala
145 150 155 160

Arg Ser Val Tyr Pro Tyr Asp Glu Phe Val Leu Ala Thr Gly Asp Phe
165 170 175

Val Tyr Met Ser Pro Phe Tyr Gly Tyr Arg Glu Gly Ser His Thr Glu
180 185 190

His Thr Ser Tyr Ala Ala Asp Arg Phe Lys Gln Val Asp Gly Phe Tyr
195 200 205

Ala Arg Asp Leu Thr Thr Lys Ala Arg Ala Thr Ser Pro Thr Thr Arg
210 215 220

Asn Leu Leu Thr Thr Pro Lys Phe Thr Val Ala Trp Asp Trp Val Pro
225 230 235 240

Lys Arg Pro Ala Val Cys Thr Met Thr Lys Trp Gln Glu Val Asp Glu
245 250 255

Met Leu Arg Ala Glu Tyr Gly Gly Ser Phe Arg Phe Ser Ser Asp Ala
260 265 270

Ile Ser Thr Thr Phe Thr Thr Asn Leu Thr Gln Tyr Ser Leu Ser Arg
275 280 285

Val Asp Leu Gly Asp Cys Ile Gly Arg Asp Ala Arg Glu Ala Ile Asp
290 295 300

Arg Met Phe Ala Arg Lys Tyr Asn Ala Thr His Ile Lys Val Gly Gln
305 310 315 320

Pro Gln Tyr Tyr Leu Ala Thr Gly Gly Phe Leu Ile Ala Tyr Gln Pro
325 330 335

Leu Leu Ser Asn Thr Leu Ala Glu Leu Tyr Val Arg Glu Tyr Met Arg
340 345 350

Glu Gln Asp Arg Lys Pro Arg Asn Ala Thr Pro Ala Pro Leu Arg Glu
355 360 365

Ala Pro Ser Ala Asn Ala Ser Val Glu Arg Ile Lys Thr Thr Ser Ser
370 375 380

Ile Glu Phe Ala Arg Leu Gln Phe Thr Tyr Asn His Ile Gln Arg His
385 390 395 400

Val Asn Asp Met Leu Gly Arg Ile Ala Val Ala Trp Cys Glu Leu Gln
405 410 415

Asn His Glu Leu Thr Leu Trp Asn Glu Ala Arg Lys Leu Asn Pro Asn
420 425 430

Ala Ile Ala Ser Ala Thr Val Gly Arg Arg Val Ser Ala Arg Met Leu
435 440 445

Gly Asp Val Met Ala Val Ser Thr Cys Val Pro Val Ala Pro Asp Asn
450 455 460

Val Ile Val Gln Asn Ser Met Arg Val Ser Ser Arg Pro Gly Thr Cys
465 470 475 480

Arg Pro Leu Val Ser Phe Arg Tyr Glu Asp Gln Gly Pro Leu Ile Glu
485 490 495

Gly Gln Leu Gly Glu Asn Asn Glu Leu Arg Leu Thr Arg Asp Ala Leu
500 505 510

Glu Pro Cys Thr Val Gly His Arg Arg Tyr Phe Ile Phe Gly Gly Gly
515 520 525

Tyr Val Tyr Phe Glu Glu Tyr Ala Tyr Ser His Gln Leu Ser Arg Ala
530 535 540

Asp Val Thr Thr Val Ser Thr Phe Ile Asp Leu Asn Ile Thr Met Leu
545 550 555 560

Glu Asp His Glu Phe Val Pro Leu Glu Val Tyr Thr Arg His Glu Ile
565 570 575

Lys Asp Ser Gly Leu Leu Asp Tyr Thr Glu Val Gln Arg Arg Asn Gln
580 585 590

Leu His Asp Leu Arg Phe Ala Asp Ile Asp Thr Val Ile Arg Ala Asp
595 600 605

Ala Asn Ala Ala Met Phe Ala Gly Leu Cys Ala Phe Phe Glu Gly Met
610 615 620

Gly Asp Leu Gly Arg Ala Val Gly Lys Val Val Met Gly Val Val Gly
625 630 635 640

Gly Val Val Ser Ala Val Ser Gly Val Ser Ser Phe Met Ser Asn Pro
645 650 655

Phe Gly Ala Val Gly Leu Leu Val Leu Ala Gly Leu Val Ala Ala Phe
660 665 670

Phe Ala Phe Arg Tyr Val Leu Gln Leu Gln Arg Asn Pro Met Lys Ala
675 680 685

Leu Tyr Pro Leu Thr Thr Lys Glu Leu Lys Thr Ser Asp Pro Gly Gly
690 695 700

Val Gly Gly Glu Gly Glu Glu Gly Ala Glu Gly Gly Gly Phe Asp Glu
705 710 715 720

Ala Lys Leu Ala Glu Ala Arg Glu Met Ile Arg Tyr Met Xaa Leu Val
725 730 735

Ser Ala Met Glu Arg Thr Glu His Lys Ala Arg Lys Lys Gly Thr Ser
740 745 750

Ala Leu Leu Ser Ser Lys Val Thr Asn Met Val Leu Arg Lys Arg Asn
755 760 765

Lys Ala Arg Tyr Ser Pro Leu His Asn Glu Asp Glu Ala Gly Asp Glu
770 775 780

Asp Glu Leu
785
(2) INFORMATION FOR SEQ ID NO: 80: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4290 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

GAGGAAGAGA GGGGGGAGGG GAAGAGAAAA GAGAAAAGGA AGAGGAGGGA GGAGAAGAAG 60

GGAAGGGTAA GAGGGGAAGA AGGGAANAAG GAAAAAGGAG TGTGAGAGGA GGTAGGAGAT 120

GAAGAGAAAA GAGGGGGAGA GAGAAGGGAA AAAGAAGTGG AAGGGAGGGA GAAATAGGGG 180

AGAGGAGAAA AGTAAGATTA GGAGGTGGAG AGGGGAGGAA GAGGAATAAG ATAGGTAGAG 240

TAGGTGAAGG TGGGAGAAGG AGAGATGTAG GGATAGGGAA AAGGGGGGGG AGGGGAGATG 300

AGATAGAGAG GAGGGGGGAA AGGAAGGGGA TGAAGAGGAG GGGAGAGGGA GGGGGGGAGA 360

AAGGAGGTGA GGGGGGAGGG GGAAGAGGGG GGAGGGGAGG GGGGAGAGAA GAANAGAAAA 420

NNAAGNNNCC CCGGCCGCGT CCCCGNTGGA GCCCCTGGGG GACCCGACCC TGTGGCGGGC 480

GCTGTATGCG TGCGTCCTGG CGGCCCTGGA GCGCCAGACG GGGCCGGTGG CCCTCTTCGT 540

CCCGCTGCGC CTGGGCTGGG ACCCGCAGAC GGGTCTGGTC GTGAGGGTCG AAAGGGCGTC 600

GTGGGGCCCG CCGGCCGCTC CTCGCGCCGC CCTCCTGGAC GTGGAGGCCA AGGTCAACTT 660

CAACCCGCTG GCCCTGGCCG CGCGCGTCGC CGAGCACCCC GGCGCGCGGT TGGCGTGGGC 720

GCGCCTGGCC GCCATTCGCA ACAGCCCCCA GTGCGCGTCC TCCGCCTCGC TCGCCGTCAC 780

CATCACGACG AGGACCGCGC GTTTCGCGCG CGAATACACC ACCCTGGCGT TTCCGCCGAC 840

CAGCAAGGAG GGCGCCTTCG CGGACCTGGT CGAGGTGTGC GAGGTATGCC TGCGGCCCCG 900

CGGACACCCG CATCGGGTCA CGGCGCGGGT GCTGCTGCCG CGCGGCTACA ACTACTTCGT 960

GAGCGCCGGC GACGGGTTCT CCGCCCCGGC GCTGGTCGCC CTCTTCCGGC AGTGGCATAC 1020

CACGGTCCAC CCCGCCCCCG GAGCCCTGGC CCCCGTCTTC GCTTTTCTGG GGCCCGGGTT 1080

TGAGGTCCGG GGAGGGCCCC TCCAATACTT TGCCGTGCTG GGATTTCCGG GCTGGCCCCC 1140

CTTTACCGTG CCGGCCGCCG CCGCCGCCGA ATCGGTGCGT GACCTGCTGC GGGGCGCCGC 1200

GTGCACCCAT CCCCTTTGCC CTGGGGGCCC TGGCCCGCGG TGGGCGCCTA GGTCTTCCTG 1260

CCCCCGCGGG CATGGCCGGC CGTGGCCTCG GAGGCGGCCG GCCGCCTCCT GCCCGCCTTT 1320

CGGGAAGCGG TGGCGCGGTG GCACCCCACG GCCACCACCA TCCAACTACT CGACCCCCCG 1380

GCGGCCGTCG GGCCGGTCTG GACGGCGCGG TTTTGTTTCT CCGGGCTCCA GGCCCAGCTC 1440

CTGGCCGCCC TCGCGGGCCT CGGGGAGGCC GGGCTGCCGG AAGCCCGGGG GCGGGCGGGC 1500

CTGGAAAGGC TGGACGCGCT GGTGGCGGCC GCCCCCTCGG AGCCCTGGGC CCGGGCCGTG 1560




CTGGAGCGCC 


TGGTGCCGGA 


CGCGTGCGAC 


GCCTGCCCCG 


CGCTCCGGCA 


GCTGCTCGGC 


1620 




GGGGTCATGG 


CCGCCGTCTG 


CCTGCAGATC 


GAGCAGACGG 


CCAGCTCGGT 


GAAGTTTGCG 


1680 


30 


GTCTGCGGCG 


GCACCGGGGC 


TGCGTTCTGG 


GGGCTGTTCA 


ACGTGGACCC 


CGGGGACGCG 


1740 




GACGCCGCGC 


ACGGCGCGAT 


CCATGACGCC 


CGCCGGGCCC 


TCGAGGCGTC 


CGTGCGCGCC 


1800 




GTACTTTCGG 


CCAACGGCAT 


ACGCCCGCGC 


CTCGCCCCCT 


CCCTGGCGCT 


AGAGGGCGTC 


1860 




TACACCCACG 


TCGTCACCTG 


GAGCCAGACC 


GGGGCGTGGT 


TCTGGAACTC 


CCGCGATGAC 


1920 




ACCGACTTCC 


TGCAGGGATT 


TCCTCTCCGC 


GGGCCCGCGT 


ACGCCGCGGC 


GGCCGAGGTT 


1980 


ATGCGCGACG CGCTGAGACG AATCCTCCGG CGGCCGGCCG CCGGCCCGCC

GTGTGCGCGG CCCGGGGCAT CATGGAGGAC GCCTGTGACC GCTTTGTCCT GGATGCCTTC 2100

GGGAGGCGTC TGGACGCGGA GTACTGGAGC GTTCTGACCC CCCCGGGCGA GGCCGACGAC 2160

CCCCTGCCCC AAACGGCCTT CCGCGGAGGC GCCCTGCTGG ACGCGGAGCA ATACTGGAGA 2220

CGCGTCGTGC GCGTATGTCC CGGGGGCGGG GAGTCGGTCG GCGTCCCCGT GGATCTGTAC 2280

CCGCGGCCCT TGGTGCTCCC CCCCGTGGAC TGCGCCCATC ACCTGCGCGA GATCCTGCGC 2340

GAGATTCAAC TGGTGTTTAC GGGGGTTCTG GAAGGCGTGT GGGGCGAGGG CGGGAGCTTT 2400

GTGTACCCTT TCGAGGAAAA GATGCGGTTT CTGTTTCCCT GAATTTGGTC AATAAACTGG 2460

GGCCCCGTGC TCCAACTTAC CCCCGCGTGT GCGCGCGTCC GTATTTACTG ACACGCGCCG 2520
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GTTGTGGTTT 


lv»l lv X t\. X X X 


CTTTGTTCCC 


TCTATCATGT CTTTCCACCA CCAGCACCAC 


2580 






V^X X XLUl^Ul 


TCGGTGCACA 


AGACACACAC 


ACAGGCCCAC 


CACCATCCCC 


2640 






PAPGAPAGGT 


AGGGAAPGTT 


CCATAAAAAA 


CACGTTTATT 


TTCCGGAGTT 


2700 




app a a a a pen 


ATAGAAAAGC 


GACGAGGTCC 

\JX±\~ \JxxVJ\J X \^ 


GTCGTTTGGG 


GCTCCCCGAA 


AGCCACCAAT 


2760 




apappagagp 


PGAAPGPAGG 


X w X X \J\Jii X x 


TCCAGCAGCT 


TCCCATGACG 


CCGGCCGGGT 


2820 




ta'pappppap 


/\V»/vj I\.^U X V7 


PGGGGGAPGG 


GCCCGGGCAA 


CTGCAACGCA AAGCTCTGCG 


2880 






A A AP AGGGPP 


GAAAGGACGG 


GGGGCGGATT 


GTTGGCCAGC 


AGGTAGTGGG 


2940 






GTGGGGPAGG 


APTAGPTPGT 


CGTCGAAGGG 


CTTCACGCCC 


GCCATGCACA 


3000 




1 1 noLU l O X ± 


PAGGATPPAG 


TGGCAGPTGC 


GGAGGAGAAG 


GCAGCGGACG 


CGCTCGAAGG 


3060 


10 


riTi.n.'h. CCZCCZ'VCZ 
Vj/iVjAv* vj v» Vj a vj 


PPflGPPP A P A 


ACCCCP A AC C 


ACGCCGACAC 


CTTGACAAAC 


AGCGAGGGCG 


3120 




tggpp.tggpt 


PPGGGGAPAG 


TTCTCCAGGT 

x x i» x v— x 


ACATCAGCAG 


GCAGACGAGC 


TCGAAGTCCC 


3180 




uo/iuu X v_.v—vj x 


P.CGCPGAAAP 


AAGGAGAGCC 


GGTGCAGCAG 


AACGAGGGGG 


GGCACGGCCA 


3240 




GGPTGTGP AC 


GTGGTPPTPG 


TTGGCCGTGA 

X X w\JV w \J X VJxV 


GGACGACGAC 


CGCAAAACCG 


CGATACGTCG 


3300 




1 vjVj A V. V. V- 


GPAGPGPTTA 


AAGTAATCCT 


CCGACGCCAC 


GTACGCGTCG 


TCGGAACTCC 


3360 


15 

X %s 


CGTCCAGAAC 


GAAPCGCAAC 


CGCCCCCTGG 


GGGTGACGTC AACGCGCAGG ACGCTGGTCG 


3420 




CGGTAAACCG 


PGGPTGGPGA 


TCGCTGACCT 


GGCGCACCTC 


GCAGGCCATG 


CGCAGCAGCG 


J *± O v 




cctggttgct 


GATPPPPTPC 


GCCACCTCGA 


CCAGACTGCG 


GTCCCCGGCG 


ATGGCCTGTT 


3540 




TGAGGATGGC 


GGCGGCCGTT 


CCCTCATCGG 


CGGGCGTGGG 


GTCGGCCATC 


CCTGCGTTGG 






ACGCCCCAGC 


CPTGGTCCGG 


CGCACCCCTC 


GGCGTTCTCC 


CGGGCGACCG 


GGATCGGGTC 


3660


20 


CGGGTCCGGG 


ACCGGGACCC 


GCCCCGCGGG 


GACGCGCTCG 


CCCGGAAATC 


GGCGGGGGTT 


3720 




GGGGAGGGGG 


GCCGGGGCAG 


AGCCGCGTGC 


TGTACGTCCG 


CCACGAACAG 


GGCCGCGACG 






TCTGTCAGGT 


ACGTCTGCAG 


\JT\_VJl3w X X X X X 


TTAAAGACCG 


CCTCCCATAA 


CTCCTCCTCC 


3840




CCTAGGATGA CATCGGAGCC GGTGATGAGC GCGCCCGCTC GGGGGGCGCG AAGCACGTAC 3900

TCGAAATACG GGGCCACGAA GGAGGCGATC GCCCCGCTAG AGTACGAGAT CGACGTTTCC 3960

TGGCCCTGGT TGTTGCGGTG GCGCAGAATC TTGAAGCAGC GCACCAGCTC GTGCTCCCAG 4020

AGGCGCGACA GGCGCTCGAG GTCCTGGCCG TACGCGGGGA TGTACTGGTG CTGGAAACTG 4080

TTGGCCACGT ACGTGTCGTC GTCCATGGAC TTGCTGACGT CGATAATGTC GTAGTCGGCC 4140

CGGAGAAGAT CCGCCTCCGC CGGGCGGCCC GCGCCTCCCC CGGCCGCCCG GTCCGCCGCG 4200

CGATGCTCCC GCTCCAGCGC CCCCGCCTGG GCGCGCCGCA GCTCGCGGTC GCGCGCCTGC 4260

AGCTGGGTCG CCGGGGACAT CTAGAGTCG 4290



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Val Phe Val Pro Leu Arg Leu Gly Trp Asp Pro Gln Thr Gly Leu Val
1 5 10 15

Val Arg Val Glu Arg Ala Ser Trp Gly Pro Pro Ala Ala Pro Arg Ala
20 25 30

Ala Leu Leu Asp Val Glu Ala Lys Val Asn Phe Asn Pro Leu Ala Ala
35 40 45

Arg Val Ala Glu His Pro Gly Ala Arg Leu Ala Trp Ala Arg Leu Ala
50 55 60

Ala Ile Arg Asn Ser Pro Gln Cys Ala Ser Ser Ala Ser Leu Ala Val
65 70 75 80

Thr Ile Thr Thr Arg Thr Ala Arg Phe Ala Arg Glu Tyr Thr Thr Leu
85 90 95

Ala Phe Pro Pro Thr Ser Lys Glu Gly Ala Phe Ala Asp Leu Val Glu
100 105 110

Val Cys Glu Val Cys Leu Arg Pro Arg Gly His Pro His Arg Val Thr
115 120 125

Ala Arg Val Leu Leu Pro Arg Gly Tyr Asn Tyr Phe Val Ser Ala Gly
130 135 140

Asp Gly Phe Ser Ala Pro Ala Leu Val Phe Arg Gln Trp His Thr Thr
145 150 155 160

Val His Pro Ala Pro Gly Ala Pro Val Phe Ala Phe Leu Gly Pro Gly
165 170 175

Phe Glu Val Arg Gly Gly Pro Leu Gln Tyr Phe Ala Val Leu Gly Phe
180 185 190

Pro Gly Trp Pro Pro Phe Thr Val Pro Ala Ala Ala Ala Ala Glu Ser
195 200 205

Val Arg Asp Leu Leu Arg Gly Ala Ala Cys Thr His Pro Leu Cys Pro
210 215 220

Gly Gly Pro Gly Pro Arg Trp Ala Pro Arg Ser Ser Cys Pro Arg Gly
225 230 235 240

His Gly Arg Pro Trp Pro Arg Arg Arg Pro Ala Ala Ser Cys Pro Pro
245 250 255

Phe Gly Lys Arg Trp Arg Gly Gly Thr Pro Arg Pro Pro Pro Ser Asn
260 265 270

Tyr Ser Thr Pro Arg Arg Pro Ser Gly Arg Ser Gly Arg Arg Gly Phe
275 280 285

Val Ser Pro Gly Ser Arg Pro Ser Ser Trp Pro Pro Ser Arg Ala Ser
290 295 300

Gly Arg Pro Gly Cys Arg Lys Pro Gly Gly Gly Arg Ala Trp Lys Gly
305 310 315 320

Trp Thr Arg Trp Trp Arg Pro Pro Pro Arg Ser Pro Gly Pro Gly Pro
325 330 335

Cys Trp Ser Ala Trp Cys Arg Thr Arg Ala Thr Pro Ala Pro Arg Ser
340 345 350

Gly Ser Cys Ser Ala Gly Ser Trp Pro Pro Ser Ala Cys Arg Ser Ser
355 360 365

Arg Arg Pro Ala Arg
370

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Val Ala Ser Glu Ala Ala Gly Arg Leu Leu Pro Ala Phe Arg Glu Ala
1 5 10 15

Val Ala Arg Trp His Pro Thr Ala Thr Thr Ile Gln Leu Leu Asp Pro
20 25 30

Pro Ala Ala Val Gly Pro Val Trp Thr Ala Arg Phe Cys Phe Ser Gly
35 40 45

Leu Gln Ala Gln Leu Leu Ala Ala Gly Leu Gly Glu Ala Gly Leu Pro
50 55 60

Glu Arg Arg Ala Gly Leu Glu Arg Leu Asp Ala Leu Val Ala Ala Ala
65 70 75 80

Pro Ser Glu Pro Trp Ala Arg Ala Val Leu Glu Arg Leu Val Pro Asp
85 90 95

Ala Cys Asp Ala Cys Pro Ala Leu Arg Gln Leu Leu Gly Gly Val Met
100 105 110

Ala Ala Val Cys Leu Gln Ile Glu Gln Thr Ala Ser Ser Val Lys Phe
115 120 125

Ala Val Cys Gly Gly Thr Gly Ala Ala Phe Trp Gly Leu Phe Asn Val
130 135 140

Asp Pro Gly Asp Ala Asp Ala Ala His Gly Ala Ile His Asp Ala Arg
145 150 155 160

Arg Ala Leu Glu Ala Ser Val Arg Ala Val Leu Ser Ala Asn Gly Ile
165 170 175

Arg Pro Arg Leu Ala Pro Ser Leu Ala Leu Glu Gly Val Tyr Thr His
180 185 190

Val Val Thr Trp Ser Gln Thr Gly Ala Trp Phe Trp Asn Ser Arg Asp

195 200 205

WO 98/20016

PCT/US97/20016

Asp Thr Asp Phe Leu Gln Gly Phe Pro Leu Arg Gly Pro Ala Tyr Ala
210 215 220

Ala Ala Ala Glu Val Met Arg Asp Ala Leu Arg Arg Ile Leu Arg Arg
225 230 235 240

Pro Ala Ala Gly Pro Pro Glu Glu Ala Val Cys Ala Arg Ile Met Glu
245 250 255

Asp Ala Cys Asp Arg Phe Val Leu Asp Ala Phe Gly Arg Arg Leu Asp
260 265 270

Ala Glu Tyr Trp Ser Val Leu Thr Pro Pro Gly Glu Ala Asp Asp Pro
275 280 285

Leu Pro Gln Thr Ala Phe Arg Gly Gly Ala Leu Leu Asp Ala Glu Gln
290 295 300

Tyr Trp Arg Arg Val Val Arg Val Cys Pro Gly Gly Gly Glu Ser Val
305 310 315 320

Gly Val Pro Val Asp Leu Tyr Pro Arg Pro Leu Val Leu Pro Pro Val
325 330 335

Asp Cys Ala His His Leu Arg Glu Ile Leu Arg Glu Ile Gln Leu Val
340 345 350

Phe Thr Gly Val Leu Glu Gly Val Trp Gly Glu Gly Gly Ser Phe Val
355 360 365

Tyr Pro Phe Glu Glu Lys Met Arg Phe Leu Phe Pro
370 375 380

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 302 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Val Arg Arg Thr Arg Ala Gly Asn Ala Gly Met Ala Asp Pro Thr Pro
1 5 10 15

Ala Asp Glu Gly Thr Ala Ala Ala Ile Leu Lys Gln Ala Ile Ala Gly
20 25 30

Asp Arg Ser Leu Val Glu Val Ala Glu Gly Ile Ser Asn Gln Ala Leu
35 40 45

Leu Arg Met Ala Cys Glu Val Arg Gln Val Ser Asp Arg Gln Pro Arg
50 55 60

Phe Thr Ala Thr Ser Val Leu Arg Val Asp Val Thr Pro Arg Gly Arg
65 70 75 80

Leu Arg Phe Val Leu Asp Gly Ser Ser Asp Asp Ala Tyr Val Ala Ser
85 90 95

Glu Asp Tyr Phe Lys Arg Cys Gly Asp Gln Pro Tyr Gly Phe Ala Val
100 105 110

Val Val Leu Thr Ala Asn Glu Asp His Val His Ser Leu Ala Val Pro
115 120 125

Pro Leu Val Leu Leu His Arg Leu Ser Leu Phe Arg Pro Thr Asp Leu
130 135 140

Arg Asp Phe Glu Leu Val Cys Leu Leu Met Tyr Leu Glu Asn Cys Pro
145 150 155 160

Arg Ser His Ala Thr Pro Ser Leu Phe Val Lys Val Ser Ala Trp Leu
165 170 175

Gly Val Val Ala Arg His Asp Phe Glu Arg Val Arg Cys Leu Leu Leu
180 185 190

Arg Ser Cys His Trp Ile Leu Asn Thr Leu Met Cys Met Ala Gly Val
195 200 205

Lys Pro Phe Asp Asp Glu Leu Val Leu Pro His Trp Tyr Met Ala His
210 215 220

Tyr Leu Leu Ala Asn Asn Pro Pro Pro Val Leu Ser Ala Leu Phe Cys
225 230 235 240

Ala Thr Pro Gln Ser Phe Ala Leu Gln Leu Pro Gly Pro Val Pro Arg
245 250 255

Thr Asp Cys Val Ala Tyr Asn Pro Ala Gly Val Met Gly Ser Cys Trp
260 265 270

Lys Ser Lys Asp Leu Arg Ser Ala Leu Val Tyr Trp Trp Leu Ser Gly
275 280 285

Ser Pro Lys Arg Arg Thr Ser Ser Leu Phe Tyr Arg Phe Cys
290 295 300



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 236 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Met Ser Pro Ala Thr Gln Leu Gln Ala Arg Asp Arg Glu Leu Arg Arg
1 5 10 15

Ala Gln Ala Gly Ala Leu Glu Arg Glu His Arg Ala Ala Asp Arg Ala
20 25 30

Ala Gly Gly Gly Ala Gly Arg Pro Ala Glu Ala Asp Leu Leu Arg Ala
35 40 45

Asp Tyr Asp Ile Ile Asp Val Ser Lys Ser Met Asp Asp Asp Thr Tyr
50 55 60

Val Ala Asn Ser Phe Gln His Gln Tyr Ile Pro Ala Tyr Gly Gln Asp
65 70 75 80

Leu Glu Arg Leu Ser Arg Leu Trp Glu His Glu Leu Val Arg Cys Phe
85 90 95

Lys Ile Leu Arg His Arg Asn Asn Gln Gly Gln Glu Thr Ser Ile Ser
100 105 110

Tyr Ser Ser Gly Ala Ile Ala Ser Phe Val Ala Pro Tyr Phe Glu Tyr
115 120 125

Val Leu Arg Ala Pro Arg Ala Gly Ala Leu Ile Thr Gly Ser Asp Val
130 135 140

Ile Leu Gly Glu Glu Glu Leu Trp Glu Ala Val Phe Lys Lys Thr Arg
145 150 155 160

Leu Gln Thr Tyr Leu Thr Asp Val Ala Ala Leu Phe Val Ala Asp Val
165 170 175

Gln His Ala Ala Leu Pro Arg Pro Pro Ser Pro Thr Pro Ala Asp Phe
180 185 190

Arg Ala Ser Asp Arg Gly Gly Ser Arg Ser Arg Thr Arg Thr Arg Ser
195 200 205

Arg Ser Pro Gly Arg Thr Pro Arg Gly Ala Pro Asp Gln Gly Trp Gly
210 215 220

Val Gln Arg Arg Asp Gly Arg Pro His Ala Arg Arg
225 230 235

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3664 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
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GTGTGTTTTT GTTTGTCTCC ACTTGAAGAG GCGTAAATTT GAGTTTCTAG GGGGGGGCCA 60 

GAGGAAAACC ANCAAGGCCC TTTAGGTTTT CGCCCTTNGN GGCCCNTGGT AACGTTTTAT 120 

CCGGNGTTAT CCAGATAGGG GAGACCCANA CCCCCTTGAG GAGGNAAACC TTTCCCCACC 180 

CGNACCCGCG CCCCAGATTT AGTGAGGGGG ANGGAAGAGC CCCAAACACC NACCCCCTTT 240 

CCGGNGGGTN GCGATTTAAT ATGCANTGCA GACAGTTCTC GATTGGAACG GGCATGGCGC 300

AACCANTNAT GGGTNGTCAT CTTGCACCCC GCTACATTAA GTTCGTTTGA AGTGGGGATG 360 

GGGGTAACAT TAACAGAACA GTTAGCCAGA TACGCCAGGG GCATTACCTC ATAAAGGACA 420 

AAGTGAGTTC CACGCGTGCG CCGTTTTAGA TTAGTGATCC CCCGGCTGCA GGATTCGATN 480 

GGGAGACAGT CACGAGTCWN GGACCACGTC GGNGACCCAG GCCCCAGNNT GTGTCCNCCC 540 

AGCCCCCCAG TCATGACGTT TGTGAGCACG ACGAGTCTGC GGCCGGGCTG GGGGCGCGTC 600

TTCGTTCGCG TGGGCCATCA CTTCCTGAAT GGCTGCGGTG CGCTGATCGC CCGAGCTGGC 660 

GAAGGGCGCC ACAACCAGCG CGCGCTCCGT CTGCAGGCCC TTCCACGTGT CGTGGAGTTC 720 

CTGAACGAAC TCGGCCACCC GCTCGGGGCC CGTGGCCGCG CGCGCGGCCT GATAGCCGGC 780 

CGAGAGGCGC CGCCAGCGCG CCAGGAACTG ACTCATGTAA CAGAACCCGG GGACCTGGTC 840 

CCCCGACATC AACTTTGACG CCCTGGCGTG GATGCCCGAC ACGATGGCCA GGAACCCGTG 900

GATTTCCCGC CGCACGACGG CCAGCACGTT ACCCTCGTGC GAGACCTGGG CCGCCAGCTC 960 

GTCGCATACC CCGAGGTGCG CCGTCGTCTC GGTGACGACG GACCGCAGCC CCGCGAGGGA 1020 

CGCGACCAGC GCGCGCTTGG CGTCGTGATA CATGCCGCAG TACTGGCTCA CCGCGTCGCC 1080 

CATGGCCTCG GGGCGCGAGG GCCCCAGGCG CTCGTGGGCG TCTGCGACCA CGGCGTACAG 1140 

GCGGTGCCCG TCGCTCTCGA ACCGGCACTC AAAGAAGGCG GCGAGCGTGC GCATGTGCAG 1200

CCGCAGCAGC ACGATCGCGT CCTCCAGCTG GCGGACCAGG GGGTCGGCGC GCTCGGCAAA 1260 

CTCCTGCATC ACCCCCCGGG CCGCCAGGGC GTACATGCTG ATCAGCAGCA GGCTGCTGCC 1320 

CACCTCGGGA GGCTGGGGGG GAGGCAGCTG GACCGCGGGC CGCAGCTGCT CGACGGCCCC 1380 

CCTGGCGATC ACGTACAGCT CGCGCAGCAG CTGCTCGATG TTGTCGGCCA TCTGCATCGT 1440 

GGGCCCGACG CCGGCCCGGG TGGCCGGTTC GAGGAGGGTG ATCAGCGCGC CCAATTTTGT 1500

GCGGTGCCCC TCGACGGTGG GGAGATAGCC CAGGCCGAAG TCGCGCGCCC AGGCCAGCAC 1560 

CCGCAGGGCA AACTCGATGG GGCGGGGCAG GTAGGCAGCG TTGCACGTGG CCCTCAGCGG 1620 

GTCCCCGACC ACCAGGGCCA GCACGTAAGG GACGAACCCC GGGTCGGCGA GGACGTTGGG 1680 

GTGGATGCCC TCCAGGGCCG GGAAGCGGAT CTTGGTGGCC GCGGCCAGGT GAACCGAGGG 1740 

GGCGTGGCTA GGCGGCCCGA CGGGGAGCAG CGCGGACAGC GGCGTGGCCG GGGTGGTGGG 1800

GGTCAGGTCC CAGTGGGTCT GGCCGTACAC GTCGAGCCAG ATGAGCGCCG TCTCGCGCAG 1860 

GAGGCTGGGC TGGCCGGCGC TGAAGCGGCG CTCGGCCGTC TCAAACTCCC CCACGAGCGT 1920 

GCGCCGCAGG CTCGCCAGGT GTTCCGTCGG CACGGCCGGG CCCATGATGC GCGCCAGCGT 1980 

CTGGCTGAGG ACGCCGCCCG ACAGGCCGAC CGCCTCACAG AGCCGCCCGT GCGTGTGCTC 2040 

GCTGGCGCCC TGGATCCGCC GGAACGTTTT CACGTAGCCG GCGTAGTGCC CGTACTCCCG 2100

CGCGAGCCCG AACACGTTCG CCCCCGCAAG GGCAATGCAC CCAAAGAGCT GCTGGATCTC 2160 

GCTGAGCCCG TGGCCGGGGG GCGTCCGCGC GGGCACCCCC GCCACCAAAA ACCCCTCCAG 2220 

GGCCGATATG TACTGGGTGC AGTGCGCGGG CGTGAACCCC GCGTCGGTAA GCGTGTTGAT 2280 

CACCACGGAG GGCGAGTTGC TGTTTTGGAC CAAAGCCCAC GTCTGCTGCA GCAGCGCGAG 2340 

GAGCCGTTGC TGGGCCCCGG CGGAGGGCGG CTCCCCTAGC TGCAGCAGGC CGGTGACGGC 2400

CGGACGGAAG ATGGCCAGCG CCGACGCACT CAGAAACGGC ACGTCGGGGT CGAAGACGGC 2460 

CGCGTCCGTC CGCACGCGCG CCATCAGCGT CCCCGGGGGC GCGCACGCCG ACCGCGGGCT 2520 

GACGCGGCTT AGGGCGGTCG ACACGCGCAC CTCCTCGCGA CTGCGAACCA TTTTGGTGGC 2580

CTCGAGGGGC GGGATCATGA TAGCCGGGTC GATCTCCCGC ACCGTGTGCT GAAACTGGGC 2640

CAGCAGCGGC GGCGGGACCA CCGCGCCCCG ATCGGGGGTC GTCAGGTACT CGTCCACCAG 2700

CGCCAGCGTA AACAGGGCCC GCGTGAGGGG GGTCAGGGCG GCGTCGTCGA TGCGCTGTAG 2760

GTGCGCCGAG AACAGCGTCA CCCAATTGCT GACCAGGGCC AAGAACCGGA GACCCTCTTG 2820

CACGATCGGG GACGGGAAGA GCAGGCTGTA CGCCGGGGTG GTCAGGTTGG CGCCGGGTTG 2880

CCCCAGGGGA ACCGGGGACA TCTTAAGCGA CATCTCCCCG AGGGCCTCCA GGGAGGTCCG 2940

CGGGTTCATG GCCAGGCAGC TCTGGGTGAC GGTCCGCCAG CGGTCGATCC ACTCCACGGC 3000

ACACTGGCGG ACGCGCACCG GCCCCAGGGC CGCCGTGGTG CGCAGCCCGG CGGCCTCCAG 3060

CGCGTGGGTC GTGTCGGAGC CGGTGATCGC CAGGACCGTG TCCTTGATGA CGTCCATCTC 3120

CCGGAAGGCC GCCTCGGGGG TCTCGGGGAG CGCCACCGCC ATGCGGTGCA CCAGCAGCCC 3180

GGGGAGGTTC TCGGCCAAGA GCGCCGTCTC CGGAAGCCCG TGGGCCCGGT GCAAGGCGCA 3240




nu\3/lvjLOu\j J. 




CCGCGCCTCC 


GCCGGGCCGA 


CCGCCGCGCC 




CGACAACAGA AACGCCGCCG TGGCGGCGCG CAGTTTGGCC GCGGACAGAA ACGCCGGCTC 3360

GTCCGCGCTG CCCGCCGGCT CGCTCGAGGG GGAGGGCGGC CGGCGGAGGT TGGTCAGGCT 3420

CCCCAACAGG ACCTGCAACG GTCCGTTTGG GGGTGGAGCG GACGGGGGGG TCATGCCGGC 3480

GGGCGCCGGG ACCTGGAGCG CGCTGTCCGA CATGGCGACC GGCGTGCGCG CTCGGCGACG 3540

CGGCGCGGAG ACCGCGGGCC CAAACGGGAA TGACTGCCGC CGCCCTATAC GGAGGGGGTA 3600

AGTATCGCCC GGGGACCCTT CGAAACCCCG GGCGTGTCGC AAGTACGCCC GCGAAAGGCG 3660

CGG 3664

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1043 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Pro Arg Leu Ser Arg Ala Tyr Leu Arg His Arg Phe Glu Gly Ser Pro
1 5 10 15

Gly Asp Thr Tyr Pro Leu Arg Ile Gly Arg Arg Gln Ser Phe Pro Phe
20 25 30

Gly Pro Ala Val Ser Ala Pro Arg Arg Arg Ala Arg Thr Pro Val Ala
35 40 45

Met Ser Asp Ser Ala Leu Gln Val Pro Ala Pro Ala Gly Met Thr Pro
50 55 60

Pro Ser Ala Pro Pro Pro Asn Gly Pro Leu Gln Val Leu Leu Gly Ser
65 70 75 80

Leu Thr Asn Leu Arg Arg Pro Pro Ser Pro Ser Ser Glu Pro Ala Gly
85 90 95

Ser Ala Asp Glu Pro Ala Phe Leu Ser Ala Ala Lys Leu Arg Ala Ala
100 105 110

Thr Ala Ala Phe Leu Leu Ser Gly Ala Ala Val Gly Pro Ala Glu Ala
115 120 125

Arg Ala Cys Trp His Pro Leu Leu Glu Gln Leu Cys Ala Leu His Arg
130 135 140

Ala His Gly Leu Pro Glu Thr Ala Leu Leu Ala Glu Asn Leu Pro Gly
145 150 155 160

Leu Leu Val His Arg Met Ala Val Pro Glu Thr Pro Glu Ala Ala Phe
165 170 175

Arg Glu Met Asp Val Ile Lys Asp Thr Val Leu Ala Ile Thr Gly Ser
180 185 190

Asp Thr Thr His Ala Leu Glu Ala Ala Gly Leu Arg Thr Thr Ala Ala
195 200 205

Leu Gly Pro Val Arg Val Arg Gln Cys Ala Val Glu Trp Ile Asp Arg
210 215 220

Trp Arg Thr Val Thr Gln Ser Cys Leu Ala Met Asn Pro Arg Thr Ser
225 230 235 240

Leu Glu Ala Leu Gly Glu Met Ser Leu Lys Met Ser Pro Val Pro Leu
245 250 255

Gly Gln Pro Gly Ala Asn Leu Thr Thr Pro Ala Tyr Ser Leu Leu Phe
260 265 270

Pro Ser Pro Ile Val Gln Glu Gly Leu Arg Phe Leu Ala Leu Val Ser
275 280 285

Asn Trp Val Thr Leu Phe Ser Ala His Leu Gln Arg Ile Asp Asp Ala
290 295 300

Ala Leu Thr Pro Leu Thr Arg Ala Leu Phe Thr Leu Ala Leu Val Asp
305 310 315 320

Glu Tyr Leu Thr Thr Pro Asp Arg Gly Ala Val Val Pro Pro Pro Leu
325 330 335

Leu Ala Gln Phe Gln His Thr Val Arg Glu Ile Asp Pro Ala Ile Met
340 345 350

Ile Pro Pro Leu Glu Ala Thr Lys Met Val Arg Ser Arg Glu Glu Val
355 360 365

Arg Val Ser Thr Ala Leu Ser Arg Val Ser Pro Arg Ser Ala Cys Ala
370 375 380

Pro Pro Gly Thr Leu Met Ala Arg Val Arg Thr Asp Ala Ala Val Phe
385 390 395 400

Asp Pro Asp Val Pro Phe Leu Ser Ala Ser Ala Ile Phe Arg Pro Ala
405 410 415

Val Thr Gly Leu Leu Gln Leu Gly Glu Pro Pro Ser Ala Gly Ala Gln
420 425 430

Gln Arg Leu Leu Ala Leu Leu Gln Gln Thr Trp Ala Leu Val Gln Asn
435 440 445

Ser Asn Ser Pro Ser Val Val Ile Asn Thr Leu Thr Asp Ala Gly Phe
450 455 460

Thr Pro Ala His Cys Thr Gln Tyr Ile Ser Ala Leu Glu Gly Phe Leu
465 470 475 480

Val Ala Gly Val Pro Ala Arg Thr Pro Pro Gly His Gly Leu Ser Glu
485 490 495

Ile Gln Gln Leu Phe Gly Cys Ile Ala Gly Ala Asn Val Phe Gly Leu
500 505 510

Ala Arg Glu Tyr Gly His Tyr Ala Gly Tyr Val Lys Thr Phe Arg Arg
515 520 525

Ile Gln Gly Ala Ser Glu His Thr His Gly Arg Leu Cys Glu Ala Val
530 535 540

Gly Leu Ser Gly Gly Val Leu Ser Gln Thr Leu Ala Arg Ile Met Gly
545 550 555 560

Pro Ala Val Pro Thr Glu His Leu Ala Ser Leu Arg Arg Thr Leu Val
565 570 575

Gly Glu Phe Glu Thr Ala Glu Arg Arg Phe Ser Ala Gly Gln Pro Ser
580 585 590

Leu Leu Arg Glu Thr Ala Leu Ile Trp Leu Asp Val Tyr Gly Gln Thr
595 600 605

His Trp Asp Leu Thr Pro Thr Thr Pro Ala Thr Pro Leu Ser Ala Leu
610 615 620

Leu Pro Val Gly Pro Pro Ser His Ala Pro Ser Val His Leu Ala Ala
625 630 635 640

Ala Thr Lys Ile Arg Phe Pro Ala Leu Glu Gly Ile His Pro Asn Val
645 650 655

Leu Ala Asp Pro Gly Phe Val Pro Tyr Val Leu Ala Leu Val Val Gly
660 665 670

Asp Ala Leu Arg Ala Thr Cys Asn Ala Ala Tyr Leu Pro Arg Pro Ile
675 680 685

Glu Phe Ala Leu Arg Val Leu Ala Trp Ala Arg Asp Phe Gly Leu Gly
690 695 700

Tyr Leu Pro Thr Val Glu Gly His Arg Thr Lys Leu Gly Ala Leu Ile
705 710 715 720

Thr Leu Leu Glu Pro Ala Thr Arg Ala Gly Val Gly Pro Thr Met Gln
725 730 735

Met Ala Asp Asn Ile Glu Gln Leu Leu Arg Glu Leu Tyr Val Ile Arg
740 745 750

Ala Val Glu Gln Leu Arg Pro Ala Val Gln Leu Pro Pro Pro Gln Pro
755 760 765

Pro Glu Val Gly Ser Ser Leu Leu Leu Ile Ser Met Tyr Ala Arg Val
770 775 780

Met Gln Glu Phe Ala Glu Arg Ala Asp Pro Leu Val Arg Gln Leu Glu
785 790 795 800

Asp Ala Ile Val Leu Leu Arg Leu His Met Arg Thr Leu Ala Ala Phe
805 810 815

Phe Glu Cys Arg Phe Glu Ser Asp Gly His Arg Leu Tyr Ala Val Val
820 825 830

Ala Asp Ala His Glu Arg Leu Gly Pro Trp Arg Pro Glu Ala Met Gly
835 840 845

Asp Ala Val Ser Gln Tyr Cys Gly Met Tyr His Asp Ala Lys Arg Ala
850 855 860

Leu Val Ala Ser Leu Ala Gly Leu Arg Ser Val Val Thr Glu Thr Thr
865 870 875 880

Ala His Leu Gly Val Cys Asp Glu Leu Ala Ala Gln Val Ser His Glu
885 890 895

Gly Asn Val Leu Ala Val Val Arg Arg Glu Ile His Gly Phe Leu Ala
900 905 910

Ile Val Ser Gly Ile His Ala Arg Ala Ser Lys Leu Met Ser Gly Asp
915 920 925

Gln Val Pro Gly Phe Cys Tyr Met Ser Gln Phe Leu Ala Arg Trp Arg
930 935 940

Arg Leu Ser Ala Gly Tyr Gln Ala Ala Arg Ala Ala Thr Gly Pro Glu
945 950 955 960

Arg Val Ala Glu Phe Val Gln Glu Leu His Asp Thr Trp Lys Gly Leu
965 970 975

Gln Thr Glu Arg Ala Leu Val Val Ala Pro Phe Ala Ser Ser Gly Asp
980 985 990

Gln Arg Thr Ala Ala Ile Gln Glu Val Met Ala His Ala Asn Glu Asp
995 1000 1005

Ala Pro Pro Ala Arg Pro Gln Thr Arg Arg Ala His Lys Arg His Asp
1010 1015 1020

Trp Gly Ala Gly Xaa Thr Xaa Xaa Gly Ala Trp Val Xaa Asp Val Val
1025 1030 1035 1040

Xaa Asp Ser



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5033 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

CGCGGTCGAC TCTAGAGGAT CCCCTGCGCC GCGTCGGGAT TCACCAACTC GTTCGCGCGC 60

TGCAGGAGGT TCTTGCCCTC GCAGACCGTC ACGCGAATGG TGGTGAGGTC GAGGAGCTCG 120

TTGAGGTCTC CGTCGGTGTG CGGCCGCGAC ATGTCCCACA GCTGTACCGC CGCCAGCCGG 180

GCGTGCGTGG CCGCCAGGCG CCCGACCGCG GCGCAAAAAA CGCGCTTGTT GAACCCGGCC 240

ACCCGGGGGG TCCACGGCGC CGTGGGGCTC GGTGGGGCGG TGCTGAATTG CACCTCCTTG 300

GCCAGTCCCT GGGCGGGTGT CTTGGTTCTT CCCGAGGCCG TGGGAGCGGG GGCGTCTAGG 360

AGCACGGCGG TATCGGCCTG GGCGGGTCGC CTGCCGCGGG CAGGGTCGGT CGCCGGGGTC 420

GCGGGGGCCT TAGGGCGCCC CGCGCGTCAT TTTGGGGGTC CGCGCGGGAG GGGCGTGCGA 480

GCGCCCGCCG GCGCCCACGG GGCCCCCGGG GGGTGGAGGA GCGCGCGCGG GGCCGGGGCC 540

GTGAGAGCCC GCGACGGACG CCGAACGACG CGGTCGCGCG GTATCCCGGG ACTCGTCGTT 600

GTCTTCGGAC GACGACGAGT CCCGGTAGAG GGCATACCCA GCCTCGTCAT AATGGAGAAA 660

GCGAACCTCG CCCCTTGGGC GCGCGCGCAT CGGGCCAGCG CCGCGGCGGA AGTCGTCGCG 720

CGGACTCTCT GGATCCGCCG GGGAGACCGG GCCATAGTAC AGCTCCTCGT GGGTCCCGCG 780

CGGCGCTTCC CGCGGACACG ACTTGACGGA GCGGCGAGAG GTCATGGTCT ATCGGAGACA 840

CCGGGGACGC CCGTGCGGAT CACAGGGAAG GCGTCGGCGA AGCAGGCAGA GAGCGTCGGA 900

AGGCGGCGAG GGAGGGAAAG AGGGAGACCG GCGGGGTACG GGAGAGCAGC GAGGGCCTGC 960

GTAACCCACG GGGGCCGCGG GAGTGGCTCC CTGCGGGTTG CGGGGGAGAG TTTATAGGAA 1020

GTGGATATAA CCGCAGGCGA CGGGACTAAC CAATCCCCGG GGGGGCAACG GACAGACACG 1080

CCCCGAACCG GCCCGACTTC CGCGAGGAAG CAAAGGCCGG GGGCCGCCCA ACGACACGCC 1140

CACCCCTTCC CAACAGGGCG GGCTCAGGCT GACCCGGCGG CCAGTGCCCG CTGGCATATC 1200

TGATACACGT GCGCGATCAT ACATACGCCC ATCGAGGTCA TGCCTAGATA AAAGGGCACC 1260




AGGACCCCCG GGACGGACAC CACACCGGCG CTGTCGCCCC GGCATTGCGC GTCCCCGATA 1320

ACGCCGCGTG CGCCTGCCGC GTTCGGCGGC TCCCCGGGCA CGCCCGCGAC GAGCGCGACG 1380

AACAACAGCA CCACCCAGCG GCCCAGTCTT GCGGGTTTCC CCGTCATCGC GGCGATGAGT 1440

CAGTGGGGGC CCAGGGCGAT CCTTGTCCAG ACGGACAGCA CCAACCGGAA TGCCGATGGG 1500

GACTGGCAAG CGGCCGTAGC TATTCGCGGG GGCGGAGTCG TTCAACTGAA CATGGTCAAC 1560

AAACGCGCCG TGGATTTTAC CCCGGCAGAA TGCGGGGACT CCGAATGGGC CGTGGGCCGC 1620

GTCTCTCTGG GCCTGCGAAT GGCAATGCCG CGTGACTTCT GCGCGATTAT TCACGCCCCC 1680

GCGGTATCCG GCCCCGGGCC CCACGTGATG CTCGGTCTCG TCCACTCGGG CTACCGCGGA 1740

ACCGTCCTGG CCGTGGTCGT ATCCCCGAAC GGGACGCGCG GGTTTGCCCC CGGGGCCCTC 1800

CGGGTCGACG TGACGTTTCT GGACATCCGG GCCACCCCCC CGACCCTCAC CGAGCCGAGC 1860

TCCCTGCACC GGTTTCCGCA GTTGGCGCCG TCCCCGCTGG CAGGGTTACG AGAAGATCCT 1920

TGGTTGGACG GGGCGCTCGC GACCGCCGGG GGGGCGGTGG CCCTGCCGGC CAGACGGCGC 1980

GGGGGATCGC TGGTCTACGC GGGCGAGCTA ACGCAGGTGA CCACCGAGCA CGGCGACTGC 2040

GTGCACGAGG CGCCCGCCTT TCTGCCAAAG CGCGAGGAGG ACGCAGGCTT TGACATTCTC 2100

ATCCACCGAG CCGTGACCGT CCCGGCCAAC GGCGCCACGG TCATACAGCC GTCCCTCCGC 2160

GTATTGCGCG CGGCCGACGG ACCAGAGGCC TGCTATGTGC TGGGGCGGTC GTCGCTCAAT 2220

GCCAGGGGCC TCCTGGTCAT GCCTACGCGC TGGCCCTCCG GGCACGCCTG TGCGTTTGTT 2280




GTATGTAACC TGACCGGAGT CCCGGTGACC CTACAAGCCG GGTCCAAGGT CGCCCAGCTG 2340

CTCGTCGCGG GGACCCACGC CCTCCCCTGG ATCCCCCCCG ACAACATCCA CGAGGACGGC 2400

GCATTCCGGG CCTACCCCAG AGGGGTTCCG GACGCGACCG CCACGCCCCG AGACCCGCCG 2460

ATTTTGGTGT TTACGAACGA GTTTGACGCG GACGCCCCCC CAAGCAAGCG GGGGGCCGGG 2520

GGGTTTGGCT CCACTGGCAT CTAAACCGCG CCTCGCGTCG GGCCAGATGG GGCCCCGGTC 2580

AATAAAGAGC TCTGTTTCGC ATATGCCCTG GTGTTGGCGG TTTTTTTTTT GTTGTCTGTC 2640

TGCCCGGCAC TCGGTTGTCC GTTCTGTCGT CGCTATCACA TACGCACAAA CACACGGGTA 2700

GAGTGGAACC GAAACCGGTC GACGTTTATT CACCACACAG AAACACAAGC TAAGCGAGAA 2760

GGAGGGGGGC CTCGGTCGAC GAGGCCTGGC GTTTGGGGGC GGACGTGCGA TGACGTGGGT 2820

CCGGTGTAGG GTCCGCGGGG GGCACGGGCC CGGGGCGAAC GGGGGATCTG TCGCCGGCGT 2880

GGGTGACTGG GACCGACGCA ACCTCCGGGG CTTGTGCCCT CGTAGGCCCG GGGGGGGCCT 2940

CGGTCGCTCC GAGCCCCGCG GTGCGGGTCC CTCCGGCCAG AGCCGAGGTG GAGAGACCAA 3000

GGGCCCGCTC CGCGATCGCC ACGTCCTCCA TGACCACGTC GCTTTCGGCC ATGCTCCGAA 3060

TGGCCTGGGA GACGAGCACG TCCGCCGACT TGTCCGCGGC CCCCACCGAC ATGTACATCT 3120

GCAGGATGGT GGCCATGCAC GTGTCCGCCA GGCGGCGCAT CTTGTCCCGA TGCGCCGCAA 3180

CGGCCCCGTC GATGGTGGAG CCCTCGAGTC CCGGGTGGTG GCGCGCCAGC CTCTCGAGGT 3240

TGACCATGCA GGCGTGGTAT GTGCGGGCCA GGGCGCGCGC CTTCACGAGG CGCCGGGTGT 3300

CGTCCAGCGA CTCTAGGGCG TCATCAAGCG TGATGGGGGC GGGCAAAAGC GCATTGACCA 3360

CCGCCAGGGC CTCCTGCAGC CGCGGCTCCG CCTCCGAGGG CGGATCCGCG GCCCGAATCA 3420

TCTCATATTG TTGTTCCTCG GGGCGCGTGC CCCAACCGCA CAGCACCCCG AGCAGGGACG 3480

CCATCCCGGA ACACGCGCGC GGCTCTGCGC CGGCTTTCCC CCACCCCACC CCCTCCGGGT 3540

TCGCAGGGGC GATGGGGACG GAAGACTGCG ATCACGAAGG GCGGTCGGTT GCGGCTCCCG 3600

TGGAGGTTAT GGCGCTGTAT GCGACCGACG GGTGCGTTAT CACCTCCTCG CTCGCCCTCC 3660

TCACAAACTG CCTGCTGGGG GCCGAGCCGT TGTATATATT CAGCTACGAC GCGTACCGGC 3720

CCGATGCGCC CAATGGCCCC ACGGGCGCGC CCACCGAACA GGAGAGGTTC GAGGGGAGCC 3780

GGGCGCTCTA CCGGGATGCG GGGGGGCTAA ATGGCGATTC ATTTCGGGTG ACCTTTTGTT 3840

TATTGGGGAC GGAAGTGGGC GTGACCCACC ACCCGAAAGG GCGCACCCGG CCCATGTTTG 3900

TGTGCCGCTT CGAGCGAGCG GACGACGTCG CCGTGCTCCA AGACGCCCTG GGCCGCGGGA 3960

CCCCATTGCT CCCGGCCCAC ATCACAGCAA CTCTGGACTT GGAGGCGACG TTTGCGCTCC 4020

ACGCTAACAT CATCATGGCT CTCACCGTGG CCATAGTCCA CAACGCCCCC GCCCGCATCG 4080

GCAGCGGCAG CACCGCTCCC CTGTATGAGC CCGGCGAATC GATGCGCTCG GTCGTCGGGC 4140

GCATGTCCCT GGGGCAGCGC GGCCTCACCA CGCTGTTCGT GCACCACAAG GCGCGCGTGC 4200

TGGCGGCGTA CCGCCGGGCG TATTATGGGA GCGCCCAAAG CCCCTTTTGG TTTCTGAGCA 4260

AATTCGGCCC GGACAAAAAG AGCCTGGTGC TGGCCGCTAG GTACTACCTA CTCCAGGCTC 4320

CGCGCTTGGG GGGCGCCGGA GCCACGTACG ATCTGCAGGC CGTGAAAGAC ATCTGCGCGA 4380

CCTACGCGAT CCCCCACGAC CCACGCCCCG ACACCCTCAG TGCCGCGTCC TTGACCTCGT 4440

TCGCCGCCAT CACTCGGTTC TGTTGCACGA GCCAGTACTC CCGCGGGGCC

GGTTTCCGCT GTATGTGGAG CGCCGCATCG CCGCCGACGT ACGCGAGACC GGCGCGCTGG 4560

AGAAGTTCAT CGCCCACGAT CGCAGTTGCC TGCGCGTGTC CGACCGGGAA TTCATTACGT 4620

ACATCTACCT GGCCCACTTT GAGTGCTTCA GCCCCCCGCG CCTGGCCACG CATCTCCGGG 4680

CCGTGACCAC CCACGACCCC AGCCCCGCGG CCAGCACGGA GCAGCCCTCG CCCCTGGGTC 4740

GGGAGGCGGT GGAACAGTTC TTCCGGCACG TGCGCGCCCA GCTGAACATC CGCGAGTACG 4800

TAAAGCAAAA CGTCACCCCC AGGGAAACCG CCCTGGCGGG AGACGCGGCC GCCGCCTACC 4860

TGCGCGCGCG CACGTATGCC CCGGCGGCCC TCACGCCCGC CCCCGCGTAC TGCGGGGTCG 4920
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CAGACTCGTC CACCAAAATG ATGGGACGTC TGGCGGAAGC AGAAAGGCTC CTAGTCCCCC 4980 
ACGGCTGGCC CGCGTTCGCA CCAACAACCC CCGGGGACGA CGCGGGGGGC GG 5033 

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Val Leu Leu Asp Ala Pro Ala Pro Thr Ala Ser Gly Arg Thr Lys Thr
1 5 10 15

Pro Ala Gln Gly Leu Ala Lys Glu Val Gln Phe Ser Thr Ala Pro Pro
20 25 30

Ser Pro Thr Ala Pro Trp Thr Pro Arg Val Ala Gly Phe Asn Lys Arg
35 40 45

Val Phe Cys Ala Ala Val Gly Arg Leu Ala Ala Thr His Ala Arg Leu
50 55 60

Ala Ala Val Gln Leu Trp Asp Met Ser Arg Pro His Thr Asp Gly Asp
65 70 75 80

Leu Asn Glu Leu Leu Asp Leu Thr Thr Ile Arg Val Thr Val Cys Glu
85 90 95

Gly Lys Asn Leu Leu Gln Arg Ala Asn Glu Leu Val Asn Pro Asp Ala
100 105 110

Ala Gln Gly Ile Leu
115



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 131 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:



WO 98/20016 



PCT/US97/20016 



Met Thr Ser Arg Arg Ser Val Lys Ser Cys Pro Arg Glu Ala Pro Arg
1 5 10 15

Gly Thr His Glu Glu Leu Tyr Tyr Gly Pro Val Ser Pro Ala Asp Pro
20 25 30

Glu Ser Pro Arg Asp Asp Phe Arg Arg Gly Ala Gly Pro Met Arg Ala
35 40 45

Arg Pro Arg Gly Glu Val Arg Phe Leu His Tyr Asp Glu Ala Gly Tyr
50 55 60

Ala Leu Tyr Arg Asp Ser Ser Ser Ser Glu Asp Asn Asp Glu Ser Arg
65 70 75 80

Asp Thr Ala Arg Pro Arg Arg Ser Ala Ser Val Ala Gly Ser His Gly
85 90 95

Pro Gly Pro Ala Arg Ala Pro Pro Pro Pro Gly Gly Pro Val Gly Ala
100 105 110

Gly Gly Arg Ser His Ala Pro Pro Ala Arg Thr Pro Lys Met Thr Arg
115 120 125

Gly Ala Pro
130



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 363 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Met Ser Gln Trp Gly Pro Arg Ala Ile Leu Val Gln Thr Asp Ser Thr
1 5 10 15

Asn Arg Asn Ala Asp Gly Asp Trp Gln Ala Ala Val Ala Ile Arg Gly
20 25 30

Gly Gly Val Val Gln Leu Asn Met Val Asn Lys Arg Ala Val Asp Phe
35 40 45

Thr Pro Ala Glu Cys Gly Asp Ser Glu Trp Ala Val Gly Arg Val Ser
50 55 60

Leu Gly Leu Arg Met Ala Met Pro Arg Asp Phe Cys Ala Ile Ile His
65 70 75 80

Ala Pro Ala Val Ser Gly Pro Gly Pro His Val Met Leu Gly Leu Val
85 90 95

His Ser Gly Tyr Arg Gly Thr Val Leu Ala Val Val Val Ser Pro Asn
100 105 110

Gly Thr Arg Gly Phe Ala Pro Gly Ala Leu Arg Val Asp Val Thr Phe
115 120 125

Leu Asp Ile Arg Ala Thr Pro Pro Thr Leu Thr Glu Pro Ser Ser Leu
130 135 140

His Arg Phe Pro Gln Leu Ala Pro Ser Pro Leu Ala Gly Leu Arg Glu
145 150 155 160

Asp Pro Trp Leu Asp Gly Ala Thr Ala Gly Gly Ala Val Pro Ala Arg
165 170 175

Arg Arg Gly Gly Ser Leu Val Tyr Ala Gly Glu Leu Thr Gln Val Thr
180 185 190

Thr Glu His Gly Asp Cys Val His Glu Ala Pro Ala Phe Leu Pro Lys
195 200 205

Arg Glu Glu Asp Ala Gly Phe Asp Ile Leu Ile His Arg Ala Val Thr
210 215 220

Val Pro Ala Asn Gly Ala Thr Val Ile Gln Pro Ser Leu Arg Val Leu
225 230 235 240

Arg Ala Ala Asp Gly Pro Glu Ala Cys Tyr Val Leu Gly Arg Ser Ser
245 250 255

Leu Asn Arg Leu Leu Val Met Pro Thr Arg Trp Pro Ser Gly His Ala
260 265 270

Cys Ala Phe Val Val Cys Asn Leu Thr Gly Val Pro Val Thr Leu Gln
275 280 285

Ala Gly Ser Lys Val Ala Gln Leu Leu Val Ala Gly Thr His Ala Leu
290 295 300

Pro Trp Ile Pro Pro Asp Asn Ile His Glu Asp Gly Ala Phe Arg Ala
305 310 315 320

Tyr Pro Arg Gly Val Pro Asp Ala Thr Ala Thr Pro Arg Asp Pro Pro
325 330 335

Ile Leu Val Phe Thr Asn Glu Phe Asp Ala Asp Ala Pro Pro Ser Lys
340 345 350

Arg Gly Ala Gly Gly Phe Gly Ser Thr Gly Ile
355 360



(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 251 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Val Gly Trp Gly Lys Ala Gly Ala Glu Pro Arg Ala Cys Ser Gly Met
1 5 10 15

Ala Ser Leu Leu Gly Val Leu Cys Gly Trp Gly Trp Glu Glu Gln Gln
20 25 30

Tyr Glu Met Ile Arg Ala Ala Asp Pro Pro Ser Glu Ala Glu Pro Arg
35 40 45

Leu Gln Glu Ala Val Val Asn Ala Leu Leu Pro Ala Pro Ile Thr Leu
50 55 60

Asp Asp Ala Leu Glu Ser Leu Asp Asp Thr Arg Arg Leu Val Lys Ala
65 70 75 80

Arg Ala Arg Thr Tyr His Ala Cys Met Val Asn Leu Glu Arg Leu Ala
85 90 95

Arg His His Pro Gly Leu Glu Gly Ser Thr Ile Asp Gly Ala Val Ala
100 105 110

Ala His Arg Asp Lys Met Arg Arg Leu Ala Asp Thr Cys Met Ala Thr
115 120 125

Ile Leu Gln Met Tyr Met Ser Val Gly Ala Ala Asp Lys Ser Ala Asp
130 135 140

Val Leu Val Ser Gln Ala Ile Arg Ser Met Ala Glu Ser Asp Val Val
145 150 155 160

Met Glu Asp Val Ala Ile Ala Glu Arg Ala Leu Gly Leu Ser Thr Ser
165 170 175

Ala Gly Gly Thr Arg Thr Ala Gly Leu Gly Ala Thr Glu Ala Pro Pro
180 185 190

Gly Pro Thr Arg Ala Gln Ala Pro Glu Val Ala Ser Val Pro Val Thr
195 200 205

His Ala Gly Asp Arg Ser Pro Val Arg Pro Gly Pro Val Pro Pro Ala
210 215 220

Asp Pro Thr Pro Asp Pro Arg His Arg Thr Ser Ala Pro Lys Arg Gln
225 230 235 240

Ala Ser Ser Thr Glu Ala Pro Leu Leu Leu Ala
245 250



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 710 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Val Thr Gly Thr Asp Ala Thr Ser Gly Ala Cys Ala Leu Val Gly Pro
1 5 10 15

Gly Gly Ala Ser Val Ala Pro Ser Pro Ala Val Arg Val Pro Pro Ala
20 25 30

Arg Ala Glu Val Glu Arg Pro Arg Ala Arg Ser Ala Ile Ala Thr Ser
35 40 45

Ser Met Thr Thr Ser Leu Ser Ala Met Leu Arg Met Ala Trp Glu Thr
50 55 60

Ser Thr Ser Ala Asp Leu Ser Ala Ala Pro Thr Asp Met Tyr Ile Cys
65 70 75 80

Arg Met Val Ala Met His Val Ser Ala Arg Arg Arg Ile Leu Ser Arg
85 90 95

Cys Ala Ala Thr Ala Pro Ser Met Val Glu Pro Ser Ser Pro Gly Trp
100 105 110

Trp Arg Ala Ser Leu Ser Arg Leu Thr Met Gln Ala Trp Tyr Val Arg
115 120 125

Ala Arg Ala Arg Ala Phe Thr Arg Arg Arg Val Ser Ser Ser Asp Ser
130 135 140

Arg Ala Ser Ser Ser Val Met Gly Ala Gly Lys Ser Ala Leu Thr Thr
145 150 155 160

Ala Arg Ala Ser Cys Ser Arg Gly Ser Ala Ser Glu Gly Gly Ser Ala
165 170 175

Ala Arg Ile Ile Ser Tyr Cys Cys Ser Ser Gly Arg Val Pro Gln Pro
180 185 190

His Ser Thr Pro Ser Arg Asp Ala Ile Pro Glu His Arg Ser Ala Pro
195 200 205

Ala Phe Pro His Pro Thr Pro Ser Gly Phe Ala Gly Ala Met Gly Thr
210 215 220

Glu Asp Cys Asp His Glu Gly Arg Ser Val Ala Ala Pro Val Glu Val
225 230 235 240

Met Ala Leu Tyr Ala Thr Asp Gly Cys Val Ile Thr Ser Ser Leu Ala
245 250 255

Leu Leu Thr Asn Cys Leu Leu Gly Ala Glu Pro Leu Tyr Ile Phe Ser
260 265 270

Tyr Asp Ala Tyr Arg Pro Asp Ala Pro Asn Gly Pro Thr Gly Ala Pro
275 280 285

Thr Glu Gln Glu Arg Phe Glu Gly Ser Arg Ala Leu Tyr Arg Asp Ala
290 295 300

Gly Gln Gly Asp Ser Phe Arg Val Thr Phe Cys Leu Leu Gly Thr Glu
305 310 315 320

Val Gly Val Thr His His Pro Lys Gly Arg Trp Met Phe Val Cys Arg
325 330 335

Phe Glu Arg Ala Asp Asp Val Ala Val Leu Gln Asp Ala Leu Gly Arg
340 345 350

Gly Thr Pro Leu Leu Pro Ala His Ile Thr Ala Thr Leu Asp Leu Glu
355 360 365

Ala Thr Phe Ala Leu His Ala Asn Ile Ile Met Ala Leu Thr Val Ala
370 375 380

Ile Val His Asn Ala Pro Ala Arg Ile Gly Ser Gly Ser Thr Ala Pro
385 390 395 400

Leu Tyr Glu Pro Gly Glu Ser Met Arg Ser Val Val Gly Arg Met Ser
405 410 415

Leu Gly Gln Arg Gly Leu Thr Thr Leu Phe Val His His Lys Ala Arg
420 425 430

Val Leu Ala Ala Tyr Arg Arg Ala Tyr Tyr Gly Ser Ala Gln Ser Pro
435 440 445

Phe Trp Phe Leu Ser Lys Phe Gly Pro Asp Lys Lys Ser Leu Val Leu
450 455 460

Ala Ala Arg Tyr Tyr Leu Leu Gln Ala Pro Arg Leu Gly Gly Ala Gly
465 470 475 480

Ala Thr Tyr Asp Leu Gln Ala Val Lys Asp Ile Cys Ala Thr Tyr Ala
485 490 495

Ile Pro His Asp Pro Arg Pro Asp Thr Leu Ser Ala Ala Ser Leu Thr
500 505 510

Ser Phe Ala Ala Ile Thr Arg Phe Cys Cys Thr Ser Gln Tyr Ser Arg
515 520 525

Gly Ala Ala Ala Ala Gly Phe Pro Leu Tyr Val Glu Arg Arg Ile Ala
530 535 540

Ala Asp Val Arg Glu Thr Gly Ala Leu Glu Lys Phe Ile Ala His Asp
545 550 555 560

Arg Ser Cys Leu Arg Val Ser Asp Arg Glu Phe Ile Thr Tyr Ile Tyr
565 570 575

Leu Ala His Phe Glu Cys Phe Ser Pro Pro Arg Leu Ala Thr His Leu
580 585 590

Arg Ala Val Thr Thr His Asp Pro Ser Pro Ala Ala Ser Thr Glu Gln
595 600 605

Pro Ser Pro Leu Gly Arg Glu Ala Val Glu Gln Phe Phe Arg His Val
610 615 620

Arg Ala Gln Leu Asn Ile Arg Glu Tyr Val Lys Gln Asn Val Thr Pro
625 630 635 640

Arg Glu Thr Ala Gly Asp Ala Ala Ala Ala Tyr Leu Arg Ala Arg Thr
645 650 655

Tyr Ala Pro Ala Ala Leu Thr Pro Ala Pro Ala Tyr Cys Gly Val Ala
660 665 670

Asp Ser Ser Thr Lys Met Met Gly Arg Leu Ala Glu Ala Glu Arg Leu
675 680 685

Leu Val Pro His Gly Trp Pro Ala Phe Ala Pro Thr Thr Pro Gly Asp
690 695 700

Asp Ala Gly Gly Gly Ile
705 710

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5742 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

AGAGGAGTAG AATAGAGAAG AGGATAGAGA GGAGTAGAGT GATCAATAAG ATGTGAAATA 60
TGAGAGAGTA GTAAAGTAGT AAAGAATTTG GGACGGAGCG TAGACAGATA GATATAGAGA 120
TATGGCCGTC TAGGAAGAAG AAATGTGTGA GAGATATAAA GTGGGTAAGA GGTCTATATG 180
AAGAGTAACG AGTAAGGGAT GGGTAGAAGA AGCCTGATGG GGAAGGTGAG AAGAAGTGTT 240
AAAGGGGATA GAATGGAGGT TAGCGAGGTG GTAGAAACAA GAAGGGGAAT AGGAAGGAAC 300
GGCCAATAGG AGAAAGAAGA GGAATGATGG AGGGAAGATG AGGTAGGAGC CATCCCGGCC 360
CACATTTACG GAAAACAGAC CAACGTGCAG GTCGCGACGG AGTTCGATAT GGAAGTAGAA 420
GTTCTCCGCG GCGCGGTCCC AAATCGGCAC CAGCAGGGAA GCATTTACAA AAGCGTACCG 480
GGTCGGAGGC CCGCCGCCCT GGTACGTGTA CGTGTACAAC CCGCACGTCT TTGGGGACGG 540
CGGCTCGCCG CCGCGACACC CTCCATACTG CCGAAGGAGG ATGGGCTGAC GGCAGGCGCG 600
GGTGAGCCGG TATTACGTCA CACGGGCCGC ATACGTTGCG TTGGGTGCGC CAAATCCGCG 660
TACCGGCGGC GCCAGCAAAA CCAGCCCCCC CGTGAGGAAC GAGCGGCCCA GGGGCTCGTG 720
ACGGACGACG CGCCGGGGCA GGTCTTGCCG CCCGGCGTCC GCGGAACCCA AGGGCCCGGG 780
ATCGTCCAAC CGGGGATAGG CATAACAATA TTGAGCCACG GGGGAACCTC CCGGGAAAAC 840
AACATCGTTG TTGGGGGGAT TAATTGGTCC GGGGACACCC GACCCGCCGC GTGTCCCCGG 900
AAGACCAGAA AGAACAAAAA GAAGAAGCAA CCTGGGAGCG ATGGCGTGCA TGACGCCGGG 960
CGGCAAGGTG CCAAAAACAC CAAAGCCCGA GGGCCGCGTC TTTTTGTGCA AAAACATCCA 1020
ACCAGCCCCC CCTCCACGCC CCCCGGGGGG AGCGGGGTCA CTTAGGGTGA AATAGCGGCA 1080
GGCGCAGCAA CTCCGCGGCG CTTGGGCGGA GCGCCGCGTC AAAGGTAAGG GCTTTGCAGA 1140
TGAGATATTC GACGTCTGTG TGGATCTTGT AGTAGCGGGT CCATGCCGGT CGGGTCCACG 1200
CCGGACGATT GTTCCCGGCC GCCCGCGAGC GGTAGTGCGC GGTGAGGCGC GATTCCGCGT 1260
GCGTTGGAAA CTCGTCGACG TGTACCTGGG CCTGTCGGAT GATGCGCGCG ATCTGGTTGT 1320
CGCACGGCCG CCTTTCGGGG TCGCGCGGGG CCGAGAACAA GGACGCGGTG TGGACGGCGG 1380
TCTCAAAGAT CACCAGGCCG GCGCTCCAGA TGTCGATTAC CTGGGTGTAC GGATCCCCGG 1440
CCAGGACCTC GGGGGCGTTT GTATCGATGG TGCCTGCGAT CCCGTAATGG AAGGGGCTCG 1500
ATCGACACCC GCGCACAAAG CACGCCGCCC CAAAGTCCCC CAGACAGATG TTCTCGGGGG 1560
TGTTGATGAG GATGTTCTCG GTCTTAATAT CGCGGTGGAT GATGCCTTCG CAGTGGACGT 1620
AGTCGATGGC GCTCAAGAGC TGCCGGGAGA CCGCGGTTAT CTGTAGGTGG CCCAACGGAG 1680
ACGGGCGCTT GCTCAGATAG GTATACAGGT CGCAGTGATA CTTGGGGAGG ACCAGACACG 1740
TGACCCCAGA AACGACGTGC AGGTCCAGGA GGGGTAGGAT CGCGGGGTGG TTCAGGCGTC 1800
TCAGCAGCCG CGCCTCGTGG TTTGTGCTGG CGTACCACCC CGCCTTGACG ATTACCCGAT 1860
GAGGGTAGTT CGGGTGGCTG CTATCAAAGA CACACCCCTC CGACCCCGGG ATGAGCGTTC 1920
CGTGGATCGC GAATCCCAGC CCCGTCACCA GTTTTGCCAG GGTCGAAGGG GGCTTGCACC 1980
CGCGGCTGAT GGCCCGAAGT GCCTCCCGGT CCATGGTGTC GAGCTCTTCC GGGGTGAACC 2040
CCGTGGCCCC CATCTTACCG CTGTCGCAGG TCCGGACGTC GGGGGGGGCT GCGCGGCCCG 2100
GAACAGGAGG ATGGCCGCTG GCTCCGGGCA GGGGGGCGGC CGAAACCATG GACAGAAAAC 2160
GCCCCTCCGC GTAGTCCTCC GGGTAGGCCA CGTCATCCGG GGCGTCATCG TCGGCCTCGT 2220
CTTCCTCCTC CGCACCCGCG GCGTCCACGA TGGGGTAGTC CTCGTCGCTG TGCATCTGGG 2280
CCAAGATCTC CTGCAGCTGA CACAGGCGCG CAGCCTCGCC GGGGGACGGT GGGCGGGAAG 2340
GGTGGATGGT TTCCGGGGGC CCGGGGGCCA GGTACGCATC CTCCGCGGGG GTATAAAAGG 2400
TGCTCGCCGG GAAGGCCGGG GCCGTGTTTG TCTCCGGCGG GACGGACGCC TCCTGTCTCT 2460
TGTCGGGTCT ACGGTAGACC CCACAGAACT TACGACAGGC CATTCGCCGC GTCGCGCGTG 2520
CCAACCAACG AGCACCCCGA GCGACGGGCC CCGGTGTTTT AAGAAGCGGC AGTTTGTCGA 2580
CACACCCCCC CACTACCCCC GCCCCCTATA TCCGGAACGT CAGATTATCC GGGATACCTA 2640
GCCAACCAAA CAAGGCTGAA AAAATCGAAC GTGCGAACGG GCCGTGTGAT AGCAAGCAGC 2700
CCCCCCGGGT CCGCGCGCCG TCCCGCCGTG CATAGGTCCG CAGACAGGCG AGTGAGTGAA 2760
GATCGGACCA CGGGCCTAAT ATACCGACAT GGGCGTTGTT GTTGTAAGTG TGGTTACCCT 2820
CCTAAACCAA CGAAACGCCC TGCCGCGGAC TTCCGCTGAC GCAAGCCCGG CTCTGTGGAG 2880
TTTTCTGCTT CGGCAATGCC GGATCCTGGC CTCCGAGCCT CTGGGAACCC CGGTGGTGGT 2940
TCGCCCGGCG AACCTTCGCA GGCTGGCCGA GCCTCTGATG GACTTGCCCA AATTCACCCG 3000
ACCGATCGTG CGAACCCGCT CCTGTCGCTG TCCCCCAAAC ACCACGACGG GCCTGTTTGC 3060
GGAGGACGAC CCCCTGGAAA GCATCGAGAT TCTGGATGCC CCTGCGTGTT TTCGGCTCCT 3120
GCATCAAGAG CGCCCCGGCC CCCACCGGCT ATACCACCTG TGGGTGGTCG GGGCGGCGGA 3180
CCTGTGTGTG CCGTTTTTAG AGTACGCACA AAAAACCCGG CTGGGGTTTC GCTTCATCGC 3240
CATGAAGACC AACGACGCGT GGGTGGGGGA ACCGTGGCCC CTGCCCGATC GGTTTTTGCC 3300
CGAGCGGACC GTGTCGTGGA CCCCGTTCCC CGCAGCGCCT AATCACCCCC TGGGAAAATC 3360
TCCTTAGCCG ATACGAATAC CAATACGGCG TGGTGGTGCC CGGCGACCGG GAACGCAGCT 3420
GTCTTCGCTG GCTACGGTCC CTCGTGGCTC CTCACAACAA ACCCCGCCCC GCATCATCCC 3480
GCCCGCATCC GGCGACCCAC CCCACGCAGC GCCCGTGTTT TACGTGCATG GGGCGACCCG 3540
AGATTCCCGA TGAGCCCTCC TGGCAGACGG GGGACGATGA CCCCCAGAAC CCCGGGCCCC 3600
CGCTGGCCGT TGGCGACGAG TGGCCTCCGT CATCCCACGT TTGCTATCCA ATCACCAACC 3660
TCTAACCCCC CCCCGATGCT AATAAAAAAC ACTGCGCCCC ATTACACGTA CGAGCGGTGT 3720
CGCGTTTGTT TCTTTTTTTG TCGTTCCTTC CTCCACCCCC AGAAAAACCA GACACTCAGA 3780
CACAAAAGCT TTCTTGTAGG GCGTTTATTT TCGTTTGGCA AACACACCGG GCTGGGGGTC 3840
CCGCGCTCAT GGCCGGAGAA ATGTGTGGCC GCAGGGATAG GGGCAGGCGG CGGGGAAGCG 3900
CATTTTTCGG CACCCGTCCT CGCGTTCTGG GGCTTCCGTT GCGCGACACG CGCCCGGGGC 3960
GGGTAGGCCA GAGCGTTCGG CGGAGCTGGT ATCGGCGACC ACCGCGGACA GCCAGGGCTG 4020
GGAGCCCTCC TCGGACGTCC AGTCAAACGT CCCAAACTCA C AC I C C A\jC L. /-y tv r~>c^ r^r^r^c 7\ t 1 4080
AGCCCCGCCG GACGCGGATT CCGGGTTCTC CCGGCCGGCC GGGGAGGGCC CCCCL ICLbl 4140
GTCGGAATGG GACGCGAGGG TATCGTCCGA GTCCGTCGCA TCCGATATCT 4200
CGTCGAGCCG CCGTTGGTCT CGAGATTGCC AACATCACAC TCGGGGCCGT AACGCCTGGC 4260
GTTCAGACAG GGCAGCTCGC ACACGGGCTC GGCGGCGGGT TCAAAACGCG CCTCGACCTC 4320
CCGGATTGCG TTTCGCAGGC GCACGTCCCA GGTTCCGCCC GAGATCTGCA GCAGGCGGCC 4380
CCACGTGCGC GGCCCCAGGC GGGTCCGGCA GTAGCCCATA AGGTAGCAGT CTCGCACCAG 4440
GTGGCGCAGG CGGTTGGCGC TGCCGGGCGG GTTCGGGGCG GCACGCAGGA TCCGAAAGAG 4500
CTGGTTGACG CACTGGCGTA TGTAATGCAG ATCAAGCGTC CAGGCGGCTC CGCTCCGCAT 4560
CTGGGCCTGG CGAGCAGAGC GCCGCGGTGC GGCTGCGCGA TCCCCGGAAG ACTGGGCCGG 4620
GGCCTGGGGT TGCGCCGCGC GGATAGGTCT GTCGTTTCTC CACACCTCGG GGAAAACCAC 4680
ACCCGCGCGC CGGTCGGGGG AGCTCGTTAA TCGCAGGTTG ATTCGGGGTC GCTTGGATTT 4740
ACGGGGGGGG GTGTCAACCA ACCAGCCGTC TGACGCATCA TCATCATCGT CGTCCGACGG 4800
CCCCGTTCCC TCCGATTCCG TCCCCGTGGT CGATTCTGCC GACAGATCCA AAAAATACCT 4860
CCCCCCGAGC TCCCGCGGGG AGCGACGGCG CCCGCCGCGT AGGTCTCCCG CCTCATCCTC 4920
GGACGACTCG GTCGAGGAGG ATTCCGATTC TGTGTCGGGC TTACCCTCAG ATTCCGACGA 4980
GCTGGGGAGG ACGGGGCGTC TGCGCTTCCG TGAACCCGGG GGTGGGGATG GGGGAGCATG 5040
ATTCGCAGGC GTCGTGTTGA CCGCGGGCGG GTCCGGGGGG ATGTCTGCCA TAGTGGCGAC 5100
GCCTTGTCGC CAGTTACCAC ACCGGTGTCC CGTCCACGAA GGCGGCGCCC GGCCTGCGAT 5160
AAAGCGCGGA TGTTGGGATC GGGGCCCCCC CCCCCCGTCT CCCTTTTCCC CTCTCTTCTC 5220
TCTCCCTCCT CCCTCCCCCC CTCCTCTGTC TCTCTCCCCT TTTTCCCTCA CCTCCCCCTC 5280
TTCTCCTCCC TCCTCCTCCT TCCCTTTCCC CTCCCCTCCC TCTCTCTTCC TCCTTCTCCT 5340
CCCCTCTTTC TTCTTCTTCC CTCTCTCCCT TTCCCCCACT CTCATTCTTC CCACTTCGCT 5400
CCCTTTTCTC TCTCTTCCCT TCTTTTCCTT TTCCCTACCC TCTCCCTTCT TCTTCCGTCT 5460
CCCCTCCCCT TCTCTCCTCT CTCTCTCCTC GTCTTTGTAT CCACGCTACC TCCTCTTCAT 5520
CTCATCTCTT TCTTCCTCTC TTCTCCTCCC TCTCCCCTCT TTTATCTCCC CATTCCTTAC 5580
TCTCTCCTTA TCTCTACCTT TATCTCAACA GCTCTCTCAC GCTCCTCCCA TGGCCATCTC 5640
CTCTCCTTCC TCTCCCCTCC TCTCACATCT CATCCTCTCT TCTCCTTCCA TCTCATCTCC 5700
CATCTCCTGC CCCCACGCTT CTCCCCCTCT CTCTTCTCAC T 5742



(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 507 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Val Gly Gly Cys Val Asp Lys Leu Pro Leu Leu Lys Thr Pro Gly Pro
1 5 10 15

Val Arg Ala Arg Trp Leu Ala Arg Ala Thr Arg Arg Met Ala Cys Arg
20 25 30

Lys Phe Cys Gly Val Tyr Arg Arg Pro Asp Lys Arg Gln Glu Ala Ser
35 40 45

Val Pro Pro Glu Thr Asn Thr Ala Pro Ala Phe Pro Ala Ser Thr Phe
50 55 60

Tyr Thr Pro Ala Glu Asp Ala Tyr Leu Ala Pro Gly Pro Pro Glu Thr
65 70 75 80

Ile His Pro Ser Arg Pro Pro Ser Pro Gly Glu Ala Ala Arg Leu Cys
85 90 95

Gln Leu Gln Glu Ile Leu Ala Gln Met His Ser Asp Glu Asp Tyr Pro
100 105 110

Ile Val Asp Ala Ala Gly Ala Glu Glu Glu Asp Glu Ala Asp Asp Asp
115 120 125

Ala Pro Asp Asp Val Ala Tyr Pro Glu Asp Tyr Ala Glu Gly Arg Phe
130 135 140

Leu Ser Met Val Ser Ala Ala Pro Leu Pro Gly Ala Ser Gly His Pro
145 150 155 160

Pro Val Pro Gly Arg Ala Ala Pro Pro Asp Val Arg Thr Cys Asp Ser
165 170 175

Gly Lys Met Gly Ala Thr Gly Phe Thr Pro Glu Glu Leu Asp Thr Met
180 185 190

Asp Arg Glu Ala Leu Arg Ala Ile Ser Arg Gly Cys Lys Pro Pro Ser
195 200 205

Thr Leu Ala Lys Leu Val Thr Gly Leu Gly Phe Ala Ile His Gly Thr
210 215 220

Leu Ile Pro Gly Ser Glu Gly Cys Val Phe Asp Ser Ser His Pro Asn
225 230 235 240

Tyr Pro His Arg Val Ile Val Lys Ala Gly Trp Tyr Ala Ser Thr Asn
245 250 255

His Glu Ala Arg Leu Leu Arg Arg Leu Asn His Pro Ala Ile Leu Pro
260 265 270

Leu Leu Asp Leu His Val Val Ser Gly Val Thr Cys Leu Val Leu Pro
275 280 285

Lys Tyr His Cys Asp Leu Tyr Thr Tyr Leu Ser Lys Arg Pro Ser Pro
290 295 300

Leu Gly His Leu Gln Ile Thr Ala Val Ser Arg Gln Leu Leu Ser Ala
305 310 315 320

Ile Asp Tyr Val His Cys Glu Gly Ile Ile His Arg Asp Ile Lys Thr
325 330 335

Glu Asn Ile Leu Ile Asn Thr Pro Glu Asn Ile Cys Leu Gly Asp Phe
340 345 350

Gly Ala Ala Cys Phe Val Arg Gly Cys Arg Ser Ser Pro Phe His Tyr
355 360 365

Gly Ile Ala Gly Thr Ile Asp Thr Asn Ala Pro Glu Val Leu Ala Gly
370 375 380

Asp Pro Tyr Thr Gln Val Ile Asp Ile Trp Ser Ala Gly Leu Val Ile
385 390 395 400

Phe Glu Thr Ala Val His Thr Ala Ser Leu Phe Ser Ala Pro Arg Asp
405 410 415

Pro Glu Arg Arg Pro Cys Asp Asn Gln Ile Ala Arg Ile Ile Arg Gln
420 425 430

Ala Gln Val His Val Asp Glu Phe Pro Thr His Ala Glu Ser Arg Leu
435 440 445

Thr Ala His Tyr Arg Ser Arg Ala Ala Gly Asn Asn Arg Pro Ala Trp
450 455 460

Trp Ala Trp Thr Arg Tyr Tyr Lys Ile His Thr Asp Val Glu Tyr Leu
465 470 475 480

Ile Cys Lys Ala Leu Thr Phe Asp Ala Ala Leu Arg Pro Ser Ala Ala
485 490 495

Glu Leu Leu Arg Leu Pro Leu Phe His Pro Lys
500 505



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 188 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Met Gly Val Val Val Val Ser Val Val Thr Leu Leu Asn Gln Arg Asn
1 5 10 15

Ala Leu Pro Arg Thr Ser Ala Asp Asp Ala Leu Trp Ser Phe Leu Leu
20 25 30

Arg Gln Cys Arg Ile Leu Ala Ser Glu Pro Leu Gly Thr Pro Val Val
35 40 45

Val Arg Pro Ala Asn Leu Arg Arg Leu Ala Glu Pro Leu Met Asp Leu
50 55 60

Pro Lys Phe Trp Ile Val Arg Thr Arg Ser Cys Arg Cys Pro Pro Asn
65 70 75 80

Thr Thr Thr Gly Leu Phe Ala Glu Asp Asp Pro Leu Glu Ser Ile Glu
85 90 95

Ile Leu Asp Ala Pro Ala Cys Phe Arg Leu Leu His Gln Glu Arg Pro
100 105 110

Gly Pro His Arg Leu Tyr His Leu Trp Val Val Gly Ala Ala Asp Leu
115 120 125

Cys Val Pro Phe Leu Glu Tyr Ala Gln Lys Thr Arg Leu Gly Phe Arg
130 135 140

Phe Ile Ala Met Lys Thr Asn Asp Ala Trp Val Gly Glu Pro Trp Pro
145 150 155 160

Leu Pro Asp Arg Phe Leu Pro Glu Arg Thr Val Ser Trp Thr Pro Phe
165 170 175

Pro Ala Ala Pro Asn His Pro Leu Gly Lys Ser Pro
180 185

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Met Gly Arg Pro Glu Ile Pro Asp Glu Pro Ser Trp Gln Thr Gly Asp
1 5 10 15

Asp Asp Pro Gln Asn Pro Gly Pro Pro Leu Ala Val Gly Asp Glu Trp
20 25 30

Pro Pro Ser Ser His Val Cys Tyr Pro Ile Thr Asn Leu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 515 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Val Gly Arg Met Arg Val Gly Glu Arg Glu Arg Gly Lys Lys Lys Lys
1 5 10 15

Glu Gly Arg Arg Arg Arg Lys Arg Glu Gly Gly Glu Gly Lys Gly Lys
20 25 30

Glu Glu Glu Gly Gly Glu Glu Gly Glu Val Arg Glu Lys Gly Glu Arg
35 40 45

Asp Arg Gly Gly Gly Glu Gly Gly Gly Arg Glu Lys Arg Gly Glu Lys
50 55 60

Gly Asp Gly Gly Gly Gly Pro Arg Ser Gln His Pro Arg Phe Ile Ala
65 70 75 80

Gly Arg Ala Pro Pro Ser Trp Thr Gly His Arg Cys Gly Asn Trp Arg
85 90 95

Gln Gly Val Ala Thr Met Ala Asp Ile Pro Pro Asp Pro Pro Ala Val
100 105 110

Asn Thr Thr Pro Ala Asn His Ala Pro Pro Ser Pro Pro Pro Gly Ser
115 120 125

Arg Lys Arg Arg Arg Pro Val Leu Pro Ser Ser Ser Glu Ser Glu Gly
130 135 140

Lys Pro Asp Thr Glu Ser Glu Ser Ser Ser Thr Glu Ser Ser Glu Asp
145 150 155 160

Glu Ala Gly Asp Leu Arg Gly Gly Arg Arg Arg Ser Pro Arg Glu Leu
165 170 175

Gly Gly Arg Tyr Phe Leu Asp Leu Ser Ala Glu Ser Thr Thr Gly Thr
180 185 190

Glu Ser Glu Gly Thr Gly Pro Ser Asp Asp Asp Asp Asp Asp Ala Ser
195 200 205

Asp Gly Trp Leu Val Asp Thr Pro Pro Arg Lys Ser Lys Arg Pro Arg
210 215 220

Ile Asn Leu Arg Leu Thr Ser Ser Pro Asp Arg Arg Ala Gly Val Val
225 230 235 240

Phe Pro Glu Val Trp Arg Asn Asp Arg Pro Ile Arg Ala Ala Gln Pro
245 250 255

Gln Ala Pro Ala Gln Ser Ser Gly Asp Arg Ala Ala Ala Pro Arg Arg
260 265 270

Ser Ala Arg Gln Ala Gln Met Arg Ser Gly Ala Ala Trp Thr Leu Asp
275 280 285

Leu His Tyr Ile Arg Gln Cys Val Asn Gln Leu Phe Arg Ile Leu Arg
290 295 300

Ala Ala Pro Asn Pro Pro Gly Ser Ala Asn Arg Leu Arg His Leu Val
305 310 315 320

Arg Asp Cys Tyr Leu Met Gly Tyr Cys Arg Thr Arg Leu Gly Pro Arg
325 330 335

Thr Trp Gly Arg Leu Leu Gln Ile Ser Gly Gly Thr Trp Asp Val Arg
340 345 350

Leu Arg Asn Ala Ile Arg Glu Val Glu Ala Arg Phe Glu Pro Ala Ala
355 360 365

Glu Pro Val Cys Glu Leu Pro Cys Leu Asn Ala Arg Arg Tyr Gly Pro
370 375 380

Glu Cys Asp Val Gly Asn Leu Glu Thr Asn Gly Gly Ser Thr Ser Asp
385 390 395 400

Asp Glu Ile Ser Asp Ala Thr Asp Ser Asp Asp Thr Leu Ala Ser His
405 410 415

Ser Asp Thr Glu Gly Gly Pro Ser Pro Ala Gly Arg Glu Asn Pro Glu
420 425 430

Ser Ala Ser Gly Gly Ala Ile Ala Ala Arg Leu Glu Cys Glu Phe Gly
435 440 445

Thr Phe Asp Trp Thr Ser Glu Glu Gly Ser Gln Pro Trp Leu Ser Ala
450 455 460

Val Val Ala Asp Thr Ser Ser Ala Glu Arg Ser Gly Leu Pro Ala Pro
465 470 475 480

Gly Ala Cys Arg Ala Thr Glu Ala Pro Glu Arg Glu Asp Gly Cys Arg
485 490 495

Lys Met Arg Phe Pro Ala Ala Cys Pro Tyr Pro Cys Gly His Thr Phe
500 505 510

Leu Arg Pro
515

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6328 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

TGCGGTCGAC TCTAGAAGAC CCTGTGCACG GGACTCGGTT GGGCGACGTC TGCGGTCTAN 60
TGGTCGGCGT GGGGACCGGC TGTGTGGTGG GTGGGGGAAG CACGTTGGAC TACAACCGAC 120
CACACGACAT GCAGGCTTCT GAGCCGCGAA TCCCCACCAT GGCGTTGGGG GCTGGGCATG 180
CCCACGCATG CAGGGATGAC GGAGATGATA GCGTGATTGA CGCCCCGCCC CCATACGAAA 240
TTGTGGCCGG CGCGAGCGCG GGCCANTTTG TCGTTATTGA TATCGACACC CCCACGGACT 300
CGCCTCCACC GTACTCTGCA GGGACGTCTC CCGTTGGGCT TGTTTCACCG GCTTCTTCCG 360
GTGACGGCGA GGTGTGTGAG CGTGGCCGCT CGCGCCGCGC CGCCTGGCGG GCCGCTCGGC 420
GCGCCAGGCG CCGCGCCGAA CGACGGGCGC GGCGCCGGAG CTTTGGCCCA GGGGGGTTGT 480
TTGTGGAGAC CCCCCTGTTT CTACCGGAAA CTATGATTGG GGCCCACCCT GGCGTGGGAG 540
GCGACCTCCC GTCGGGCCTC CCTACTTACG CAAAGGCGAC CTCGGATCGC CCCCCCACCT 600
ACGCCATGGT CATGGCCGCA TGTCCGACCG AGCCACCGGG CGGGTCCGTG GGGCCGGCCG 660
ACCAACCCCG CGTGCAAAGC TCGCGCACGT GGCGACCCCC GCTCGTCAAT TCGCGAGAGC 720
TGTACCGGGC CCAACGCGCG GCCCGCTGTG CGTCAAGCTC CGACACGCCC CAAGCCCCAG 780
GGTGGTGTGG CGGGACGTGT CGTCATGCGG TTTTTGGGGT GGTCGCGGTG GTCGTCGTTA 840
TCATCTTGGC CTTCCTTTGG CGGTAAGCTT CCCCCCTCCC GCGATACAAC GAATAAAAGT 900
CGCGTTAACA CACACGCTGG TTCGTCGCGT GGTATTTACC GGGTTCCTAT AACCCACAAA 960
CTCACACCGC GTCTGTTTTG GTTGGTTCTC ACTCTTTATT AATGAGGTTG CATACGGACT 1020
CGGAGGGAAG GGGGTGGGTT ATACCTTGAT TTTGATTTTG ATTTTGTGGC GTCGCTGTTC 1080
TGCCGCGCGA GCGGCCGTGC CGCTTGAGCT TATAAAGCGA AGGTGTTGTA GGGCCGCGGA 1140
TGCCCCGAGC CAGCAGGTTT TGGAGAACGG ATACCGACAG TGACAGTGGT ACACGATACC 1200
GTTTATCGTG TATTCCCCCG CGGGGGATGC ACCGGAGACG TCCTTGATGG TTCCGCAGCT 1260
GAATGGGCCC GTGCACCGCA GCCGCCCCCC ACGCTTATCC TCTAGTTCGC GCAAGACGGG 1320
CGGGTGGTTG ACCAGGTCCG CGAAGTTGCG GAGCTCGGTA ACCAGCGGAA GGGAGTGCTG 1380
CGAGTCCTTG TACACCGCAA AGAAAAAACA GTGGATGGGG CCGCTGGTCG AGCAGGAGGC 1440
GCGGACTAAA TAACTCCGCG CGGCCAGGCA CGCGGGGTCC GTTTTGGTCG TGTGCAGGGC 1500
GTTTTCGGCC TGCCACGTGG CATTCAGACA GTACGGGGGG GCGACGTGGG TGATGTCCGG 1560
GGCCCGTAAA AACAGGTTCG AGAGGGGCGT CGTTGTCATG TTGCGAGGGG GGGGGTCGGT 1620
GAATACGCGT CTGCCGTGAG TTCCTGGACG CGCTATGAAG CGGGCCGGGC CGGGGCCCCA 1680
CATTTATCCG GTGGGTCATC GCCCTCCTCC CACGCGCACG CCGGCATCGC CCCGGAGTCT 1740
CCGCCCCACC CGCCGCGCGC GCCAAGAACA TCACACGGAA CCACTTGGGT TGACGTCAAT 1800
ATGTTTATTC TTGCCTAAAA TAGGGAGTTG CAGTAGAAGT ATTTGCCGTG CACATATAAG 1860
GGGGCGATAG TGTGACTGGC CGTCAGCTCG CACACGCGAC TGGAACACTC CTGGCGGTGC 1920
GTGTCCAGTA TTTCAATGAG ACCCGCCATG CAGGCCCCCG GGATGTAAAA GTGCATCGTC 1980
TCGCCGGCCC CAACCCCCAC GGTCGTGTAG TCGATCTCCG ACACGCCGCG CTCGACGCGG 2040
TTGGCGAGGC GGGCCAGGAT GACCAACACA AAGGAGGCAA TATCCTTAAT GTCCGACAGG 2100
CGTCGCCGCG AGCACAGGTC GTCCAGCCCG CACAGGCCTC GGGCCTTCAG GTAGCACTGC 2160
AGAAAGGGGC GCAGGCGCGT GGCGAGGTTT TCCAGCACGG CGGCCGCCGT TCCGATGATA 2220
GGGTCCTGGG GGCGGAGCGG CAGATTGTGG TGAATGCACA TCTTGCACCA CGCCAGCGTC 2280
TCATCCGCGG ACGCCAGGGC CTCGATGAGA TTTTCCTGGC GCAGCACGCA GTCGCGCATG 2340
GCCTTGGCTG TCGACGCGGC CCGCGGGTTG GCTGCGAATG TGCGGTAGAG GCTCGGGCCG 2400
TGAGCGACCA GGGTTTCCCA GGAAACCCGA CGGGTCTCGG CGTCAAACCC CCCCGCTTGG 2460
GTGGCCAGCA CGGGAGCCCA GGGGCTGTTC GCGGCGGGAA ACGGCATCCC GCCAAAGGGG 2520
TCTTGCATGA CCAGGGCACT GCGTCCAAAG CTTTCGCTGA TGCGCTCGAC CGCCGCGCGC 2580
TCGGATATGG ATCGCAGAAC CGCCCGAACG GCGGGGTCGA TGGTGTCGGC AGAGGGCGCC 2640
TTTCGCTCCG GGACCGGGGC GCGGCCGTCC GCGTGCGGGG GGGTCAGGGA CAGCGCCATC 2700
AGCGGAGGGG GGGCCTGGCG CGTGCCTCGT GGCCGCGGGG CCCCCGGAGA CGTCGAGCTG 2760
CTCCCCTCCG CGCCGCGCCT CGCCGTGGGT GGCGCCGGGG CCGTCCGTCC GCGCCGACGC 2820
GGGGTGGCAA CCCCCTTGGT TGTGGGCGTT TCTGGAGACG CGCCGGCGGG GGTTTGGTGT 2880
GGAGTCGGCG CCGCCGGGGC CGTATTGACC CCGGCCCCGG CGGCGACCTC GCCGCCCGCC 2940
TCGGGGATGC GGTGCCTTGG TCGACGGGGG TTGGATGCGG GCCACCTTCC CCCCGTGCGG 3000
TTCCCGGGGG GAAGCCGACC GCCTGGTCCC GAGGCGCGAC CACACGCCGG TGGTCGCGGG 3060
TGGCGGATCG TCGGCTCCCC GCCGCGCTGC CGGGCGAGGC GTCAAGGCTT CGGGGGTGCC 3120
GGCGTCCTCG GGGCGGGCCG GGGGACCTTT GGGAATCGCC GCGTAGATGG CCTCCGCCCC 3180
TCCGTCTCCG CAGGGGTCTT CCATGTCCTC GTCCGACAAG GAACACTCCC CGCTGCTGTC 3240
GGACTCGGGG TCGTCGCGGC GGCCCTCCTC GTCCCGCTCC AGAGCGTCCT CCTCGAGCTC 3300
GCTGTCGGAC AGGTCCAATC CTAGGTCGAT TAGCATATCA ATGTCGGTAG CCATGTTGTA 3360
GGTCGCCGGG GCTGGGATGG CGGGTGTCCT CCGAGGGGGC GCGTGTCGGA AGAGAGTGGC 3420
CGGGTCCGAA TCGAGGAGCG GCACCGACGC GCAACCGGGG TCGGCACACG GCAGCACACA 3480
GCGCCCAGTG GGCCGGTGGC TGCCCTTATA CCCGCACGAC CGGGGCCGGC TTTCCGAAAC 3540
TCCTCCTTGT CCCTCCCCGT CGGGCGTCAC CGCCCCCGCC CCCGCCGTCC CCAGAAACCA 3600
ATCGGACGCC GAGGGTGGGT TTTATGTATT TAATTAGCAT ACGGCAGGTC TGGGTCCGCC 3660
TTCGCGTACA CGCGTAGGCG GGGGTGCGGA AGCACGCGGT AGGGTGGGGT GTATGCGGAA 3720
GTCGGACGAG CCTGCCTGTG CTGGACCGGG GGAGGGGCAA GCAGACCCGA GGCCGGATCG 3780
GCTCTGTGCA CGATTTTAAT TTGCATGCGA CGTGCGAGGG TGCGTAGGCC CGAGGCGGGT 3840
CGTGTATTTA ATTTGCATGG GCGACTGGGT CCGCCTCTTC CAACGGAAGA GGCGTTACGT 3900
CACAGATCAA ACAGGCGCCG CTGAATCTCC TGTTCGTAGC GAAGCGCCAT AAGCACCACC 3960
CCGGCCACGA CGGCGATATA GCACAGGCGC ACGGCGATAC CGGAGAGGAT GATGGAACAG 4020
CAGCGCCCGC AGACGCCCGA CCACCCTTTG GAGCGCCCCC TGGGGGCCGC TGGTTCCGCG 4080
TTTTTGGGGG CCGAGTCCCG CCGCAGGATG AAATACAGCT CCGTCAGGGC GATGATGGAC 4140
ACGAAACACC AGGTGGTGAT TGTTAGAAAC AGGGGGTATG TGATCGCGCA GGCGCCCCGG 4200
GAGATGAGAG CGGTGCCGAC GATGAGACCG AGGGCCACGA AGCGGAGCAG CAGCTCGCAG 4260
CCCACGATGA CGCCAACGGC CGGGCGGTGG TACAAGAAGG TGACCGGATC CGCCTCGAAC 4320


AGCTGCACCA 


GGGTCTGGCG 


TTGAACGGAT 


AGCTCGCAGA 


GGAGGCGGGT 


GATTTTCGTG 


4380 


TAGGGGTATT 


GCAAGAACAC 


GCTCGACACT 


ATGCGGCCGG 


CGTAGTTCAA 


AAGATAGGTC 


4440 


GCCGGGGCCA 


CCATGCTGTG 


CGCGGGACTC 


ACGACGCCGA 


ACATGCATCG 


TCGTTGGTGA 


4500 


AGGGCGACGA 


ACGCTAGATA 


CAGAAACCAA 


CCGACGACCA 


CCAGGCGCAT 


CTGGGTGTCC 


4560 


CAGAGGGCCT 


CCAAGCAGTT 


TACGGCCTCG 


TGCACGTTCA 


TGACCCGGCG 


GCTCATGGCG 


4620 


CCGGGGATGG 


CCGGGAGGGA 


CACGGCCCGA 


CCTTCGATGA 


TATTGGCGTA 


GCAGACGTGG 


4680 


GCGTGGGGGG 


TCCATGCCCC 


GCCGGGGGGG 


GCGGTCGGCG 


GGCCCAGAAA 


CAACAGCGTC 


4740 


TGGTTTATCT 


TCATCCACAC 


GAGGGCGGTA 


TCGTTGTGTG 


CCCCGGCGGG 


GCGCACCGCG 


4800 


TAAATACATC 


GGTGGAGCGG 


ACTGGCACCA 


AAGACGATGT 


ACCACGCGAG 


CACGAGGCCG 


4860 


TAGGCCGTTA 


TGAAGATGAC 


GGTCGTGAGG 


TGCTGGAGGG 


AGCGGACCGC 


GAGCATGGCG 


4920 


TGCCCGCATC 


GACGGTAAAC 


AGCGTGTGCA 


GGCGGTTGTT 


ATCGCATTTA 


GTGGCAAAGC 


4980 


ACTGCTGACA 


CAGGGACGCG 


CATAGGCGGT 


TGTTGGTCCC 


GACGCTCAGC 


GCGAGAAAGG 


5040 


TCCGGGCCGT 


CGCGCGACTT 


GCCCTGCCGT 


GCTTGAAGCG 


CAGACACGAG 


AGGCTTTGGT 


5100 


TCAGGGCGCC 


GCGGCCGGGG 


ATCAGCTGCA 


GCAGGACCCA 


GTCGTCCTTA 


ATGACGGCCC 


5160 


GGCGAACGGA 


CACGGCCTGA 


TATTCCCGCG 


CGTGCTCGGG 


AAAGTGGGCC 


TCGATGCACG 


5220

CGACTATTCG 


CCCCAGCAGT 


TCTGACGCGA 


ACCCCTCCAC 


GGTCTCCCCG 


GCGTCGGGCC 


5280 


GCACATCGTA 


GCGGCCCAGA 


ACCTCCGTCA 


GGGTCTCGCG 


TCGCCCAAAG 


TGCTCCAGGG 


5340 


CGTTGCGCGA 


CGCCTTCTTC 


TCGAAAAAGC 


TGACATAGTC 


CCCGCCCAGG 


CTGTGGAGGA 


5400 


CGCGGATCTC 


CCGCGGGGCC 


GCGGAAAACA 


TGGGCGGGGC 


GTGAAAGTGG 


AAGCGCCGCG 


5460 


GGTCGGCGTG 


CGCGGCGACG 


AACGCCGGAA 


CGTCCTCGCA 


CGCGGGGGGG 


ATCACGAAGA 


5520 


CGGGCAATAA 


CCGGCCGCAC 


GCGGAGCCGT 


CGGGGCCGAT 


TTTGGCGAAA 


TACGGCAAGC 


5580 


GCAGGCTGTG 


GCCGTGGGCG 


TACACGCCCG 


TATCGATCAG 


CAAAAAGTTC 


TTTACGTGGC 


5640 


TCCCTACGGC 


CTCCACGAAG 


TTGCGGTCCA 


ACAGCACCGC 


CTGCTGGATC 


ACCCTCGCCA 


5700 


CCCCACGCAT 


TGTCAGGGAG 


CCGTGCACAA 


CGTACGGGGC 


GGGGACCGGT 


AGGCACACGC 


5760 


GCAGCCCGAT 


CTTGTCGGCG 


CAGGAACACA 


CGACGGAGTC 


CGGTTCGCTG 


GGCGTCGCCG 


5820 


CTGGTATCTG 


TTCGTGTAGC 


AGGTCGAGGT 


ACGCGGCCTC 


GTCGTCCGGG 


AGGGGGCCGT 


5880 


GGGTCGTGTC 


CATGGGGTCC 


GTGTCCTCCT 


CCCACTCCTC 


GTCGCCGTCG 


TCGCCACCGG 


5940 


CGTCGGGGAA 


CCAGTCCCCG 


TCGCCGTCGT 


CGCCACCGGC 


CGAGGGCCCG 


TCGCCCGCAC 


6000 


AGACGGGCGG 


CGCGCGGGGC 


CGACAGGCGC 


TTTTGAAAAA 


ATAACAGGGA TAGGCGTCGG 


6060 


GGTTTACGCG 


GGCCGCGGGA 


AACAACAGNT 


GAACCGCCGC 


CAGCGCCCCG 


CGCATAAAGT 


6120 


GACCCAGGGC 


CTCGTGGAGC 


CGGGGAAAGG 


GGACGGGCTC 


CTTCAGGGCG 


ATGTCCAGAT 


6180 


CCAGGATGAT 


GTTCGTAACG 


GCCAGCGCGG 


CGTTGAAGAT 


CTCGTTGCGG 


TTCACGTACA 


6240 


TATGGCCCCC 


GACGGCCAGG 


CCGCCCGGGG 


GGAGCGCGGG 


CCCCGGCGCA 


AAAGGGCGGT 


6300 


GACCGGGGAC 


TTCTAGATCG 


AATGCAG 








6328 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 86 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Met Val Met Ala Ala Cys Pro Thr Glu Pro Pro Gly Gly Ser Val Gly
1 5 10 15

Pro Ala Asp Gln Pro Arg Val Gln Ser Ser Arg Thr Trp Arg Pro Pro
20 25 30

Leu Val Asn Ser Arg Glu Leu Tyr Arg Ala Gln Arg Ala Ala Arg Cys
35 40 45

Ala Ser Ser Ser Asp Thr Pro Gln Ala Pro Gly Trp Cys Gly Gly Thr
50 55 60

Cys Arg His Ala Val Phe Gly Val Val Ala Val Val Val Val Ile Ile
65 70 75 80

Leu Ala Phe Leu Trp Arg
85

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Trp Gly Pro Gly Pro Ala Arg Phe Ile Ala Arg Pro Gly Thr His
1 5 10 15

Gly Arg Arg Val Phe Thr Asp Pro Pro Pro Arg Asn Met Thr Thr Thr
20 25 30

Pro Leu Ser Asn Leu Phe Leu Arg Ala Pro Asp Ile Thr His Val Ala
35 40 45

Pro Pro Tyr Cys Leu Asn Ala Thr Trp Gln Ala Glu Asn Ala Leu His
50 55 60

Thr Thr Lys Thr Asp Pro Ala Cys Leu Ala Ala Arg Ser Tyr Leu Val
65 70 75 80

Arg Ala Ser Cys Ser Thr Ser Gly Pro Ile His Cys Phe Phe Phe Ala
85 90 95

Val Tyr Lys Asp Ser Gln His Ser Leu Pro Leu Val Thr Glu Leu Arg
100 105 110

Asn Phe Ala Asp Leu Val Asn His Pro Pro Val Leu Arg Glu Leu Glu
115 120 125

Asp Lys Arg Gly Gly Arg Leu Arg Cys Thr Gly Pro Phe Ser Cys Gly
130 135 140

Thr Ile Lys Asp Val Ser Gly Asp Ala Gly Glu Tyr Thr Ile Asn Gly
145 150 155 160

Ile Val Tyr His Cys His Cys Arg Tyr Pro Phe Ser Lys Thr Cys Trp
165 170 175

Leu Gly Ala Ser Ala Ala Leu Gln His Leu Arg Phe Ile Ser Ser Ser
180 185 190

Gly Thr Ala Ala Arg Ala Ala Glu Gln Arg Arg His Lys Ile Lys Ile
195 200 205

Lys Ile Lys Val
210

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(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 286 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 



Met Ala Leu Ser Leu Thr Pro Pro His Ala Asp Gly Arg Ala Pro Val
1 5 10 15

Pro Glu Arg Lys Ala Pro Ser Ala Asp Thr Ile Asp Pro Ala Val Arg
20 25 30

Ala Val Leu Arg Ser Ile Ser Ala Ala Val Glu Arg Ile Ser Glu Ser
35 40 45

Phe Gly Arg Ser Ala Leu Val Met Gln Asp Pro Phe Gly Gly Met Pro
50 55 60

Phe Pro Ala Ala Asn Ser Pro Trp Ala Pro Val Leu Ala Thr Gln Ala
65 70 75 80

Gly Gly Phe Asp Ala Glu Thr Arg Arg Val Ser Trp Glu Thr Leu Val
85 90 95

Ala His Gly Pro Ser Leu Tyr Arg Thr Phe Ala Ala Asn Pro Arg Ala
100 105 110

Ala Ser Thr Ala Lys Ala Met Arg Asp Cys Val Leu Arg Gln Glu Asn
115 120 125

Leu Ile Glu Ala Ser Ala Asp Glu Thr Leu Ala Trp Cys Lys Met Cys
130 135 140

Ile His His Asn Leu Pro Leu Arg Pro Gln Asp Pro Ile Ile Gly Thr
145 150 155 160

Ala Ala Ala Val Leu Glu Asn Leu Ala Thr Arg Leu Arg Pro Phe Leu
165 170 175

Gln Cys Tyr Leu Lys Arg Leu Cys Gly Leu Asp Asp Leu Cys Ser Arg
180 185 190

Arg Arg Leu Ser Asp Ile Lys Asp Ile Ala Ser Phe Val Leu Val Ile
195 200 205

Leu Ala Arg Leu Ala Asn Arg Val Glu Arg Gly Val Ser Glu Ile Asp
210 215 220

Tyr Thr Thr Val Gly Val Gly Ala Gly Glu Thr Met His Phe Tyr Ile
225 230 235 240

Pro Gly Ala Cys Met Ala Gly Leu Ile Glu Ile Leu Asp Thr Gln Glu
245 250 255

Cys Ser Ser Arg Val Cys Glu Leu Thr Ala Ser His Thr Ile Ala Pro
260 265 270

Leu Tyr Val His Gly Lys Tyr Phe Tyr Cys Asn Ser Leu Phe
275 280 285

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 332 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Met Leu Ala Val Arg Ser Leu Gln His Leu Thr Thr Val Ile Phe Ile
1 5 10 15

Thr Ala Tyr Gly Leu Val Leu Ala Trp Tyr Ile Val Phe Gly Asp Leu
20 25 30

His Arg Cys Ile Tyr Ala Val Arg Pro Ala Gly Ala His Asn Asp Thr
35 40 45

Ala Leu Val Trp Met Lys Ile Asn Gln Thr Leu Leu Phe Leu Gly Pro
50 55 60

Pro Thr Ala Pro Pro Gly Gly Ala Trp Thr Pro His Ala His Val Cys
65 70 75 80

Tyr Ala Asn Ile Ile Glu Gly Arg Ala Val Ser Leu Pro Ala Ile Pro
85 90 95

Gly Ala Met Ser Arg Arg Val Met Asn Val His Glu Ala Val Asn Cys
100 105 110

Leu Glu Ala Leu Trp Asp Thr Gln Met Arg Leu Val Val Val Gly Trp
115 120 125

Phe Leu Tyr Leu Ala Phe Val His Gln Arg Arg Cys Met Phe Gly Val
130 135 140

Val Ser Pro Ala His Ser Met Val Ala Pro Ala Thr Tyr Leu Leu Asn
145 150 155 160

Tyr Ala Gly Arg Ile Val Ser Ser Val Phe Leu Gln Tyr Pro Tyr Thr
165 170 175

Lys Ile Thr Arg Leu Leu Cys Glu Leu Ser Val Gln Arg Gln Thr Leu
180 185 190

Val Gln Leu Phe Glu Ala Asp Pro Val Thr Phe Leu Tyr His Arg Pro
195 200 205

Ala Val Gly Val Ile Val Gly Cys Glu Leu Leu Leu Arg Phe Val Gly
210 215 220

Leu Ile Val Gly Thr Ala Leu Ile Ser Arg Gly Ala Cys Ala Ile Thr
225 230 235 240

Tyr Pro Leu Phe Leu Thr Ile Thr Thr Trp Cys Phe Val Ser Ile Ile
245 250 255

Ala Leu Thr Glu Leu Tyr Phe Ile Leu Arg Arg Asp Ser Ala Pro Lys
260 265 270

Asn Ala Glu Pro Ala Ala Pro Arg Gly Arg Ser Lys Gly Trp Ser Gly
275 280 285

Val Cys Gly Arg Cys Cys Ser Ile Ile Leu Ser Gly Ile Ala Val Arg
290 295 300

Leu Cys Tyr Ile Ala Val Val Ala Gly Val Val Leu Met Ala Leu Arg
305 310 315 320

Tyr Glu Gln Glu Ile Gln Arg Arg Leu Phe Asp Leu
325 330




(2) INFORMATION FOR SEQ ID NO: 103: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 482 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

Ala Ala Phe Asp Leu Glu Val Pro Gly His Arg Pro Phe Ala Pro Gly
1 5 10 15

Pro Ala Leu Pro Pro Gly Gly Leu Ala Val Gly Gly His Met Tyr Val
20 25 30

Asn Arg Asn Glu Ile Phe Asn Ala Ala Val Thr Asn Ile Ile Leu Asp
35 40 45

Leu Asp Ile Ala Leu Lys Glu Pro Val Pro Phe Pro Arg Leu His Glu
50 55 60

Ala Leu Gly His Phe Met Arg Gly Ala Ala Val Xaa Leu Leu Phe Pro
65 70 75 80

Ala Ala Arg Val Asn Pro Asp Ala Tyr Pro Cys Tyr Phe Phe Lys Ser
85 90 95

Ala Cys Arg Pro Arg Ala Pro Pro Val Cys Ala Gly Asp Gly Pro Ser
100 105 110

Ala Gly Gly Asp Asp Gly Asp Gly Asp Trp Phe Pro Asp Ala Gly Gly
115 120 125

Asp Asp Gly Asp Glu Glu Trp Glu Glu Asp Thr Asp Pro Met Asp Thr
130 135 140

Thr His Gly Pro Leu Pro Asp Asp Glu Ala Ala Tyr Leu Asp Leu Leu
145 150 155 160

His Glu Gln Ile Pro Ala Ala Thr Pro Ser Glu Pro Asp Ser Val Val
165 170 175

Cys Ser Cys Ala Asp Lys Ile Gly Leu Arg Val Cys Leu Pro Val Pro
180 185 190

Ala Pro Tyr Val Val His Gly Ser Leu Thr Met Arg Gly Val Ala Arg
195 200 205

Val Ile Gln Gln Ala Val Leu Leu Asp Arg Asn Phe Val Glu Ala Val
210 215 220

Gly Ser His Val Lys Asn Phe Leu Leu Ile Asp Thr Gly Val Tyr Ala
225 230 235 240

His Gly His Ser Leu Arg Leu Pro Tyr Phe Ala Lys Ile Gly Pro Asp
245 250 255

Gly Ser Ala Cys Gly Arg Leu Leu Pro Val Phe Val Ile Pro Pro Ala
260 265 270

Cys Glu Asp Val Pro Ala Phe Val Ala Ala His Ala Asp Pro Arg Arg
275 280 285

Phe His Phe His Ala Pro Pro Met Phe Ser Ala Ala Pro Arg Glu Ile
290 295 300

Arg Val Leu His Ser Leu Gly Gly Asp Tyr Val Ser Phe Phe Glu Lys
305 310 315 320

Lys Ala Ser Arg Asn Ala Leu Glu His Phe Gly Arg Arg Glu Thr Leu
325 330 335

Thr Glu Val Leu Gly Arg Tyr Asp Val Arg Pro Asp Ala Gly Glu Thr
340 345 350

Val Glu Gly Phe Ala Ser Glu Leu Leu Gly Arg Ile Val Ala Cys Ile
355 360 365

Glu Ala His Phe Pro Glu His Ala Arg Glu Tyr Gln Ala Val Ser Val
370 375 380

Arg Arg Ala Val Ile Lys Asp Asp Trp Val Leu Leu Gln Leu Ile Pro
385 390 395 400

Gly Arg Gly Ala Leu Asn Gln Ser Leu Ser Cys Leu Arg Phe Lys His
405 410 415

Gly Arg Ala Ser Arg Ala Thr Ala Arg Thr Phe Leu Ala Leu Ser Val
420 425 430

Gly Thr Asn Asn Arg Leu Cys Ala Ser Leu Cys Gln Gln Cys Phe Ala
435 440 445

Thr Lys Cys Asp Asn Asn Arg Leu His Thr Leu Phe Thr Val Asp Ala
450 455 460

Gly Thr Pro Cys Ser Arg Ser Ala Pro Ser Ser Thr Ser Arg Pro Ser
465 470 475 480

Ser Ser



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10212 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:





GAAAGTGGAG 


GGAAGGGGGG 


GAGGAGTGGG 


AGTGGTTAAG 


GAGGTGGGAG 


GAGAGAGAGA 


60 




AGGGAGGGGA 


GGTGGTGGAG 


AGTGTGGAGT 


AAGGAAGGAG 


AAAGGAGGAG 


AAGGGAGGGA 


120 




GTTTGATTGT 


AGAGGAAAAG 


AGGTGGGAGA 


GGAGAGGGGG 


ATGATGGATG 


GAGGGTATGG 


180 




GAGTGGAGGG 


GGGGGGTGGA 


GGTGGAGTGG 


GGGTGGGAGT 


TAGGGGGTGG 


GGGGGTGGGG 


240 




GNTGTGGTGG 


GTTGGTGGTG 


GGGGGGTGGT 


GTTGGGTGGG 


NTGGGGTGGG 


TGGGTGGGGG 


300 








TTTTGTTNTT 


NNTTTTNNNT 


NNNNNTNNTN 


NNNNNTTTTT 


360 




TGGCGCCGGA 


CCTCACGGAC 


CCGCTGCTGT 


TTGCGTACGT 


CGGATTCCAG 


GTCGTGAACC 


420 




ACGGGCTGAT 


GTTTGTGGTC 


CCCGACATCG 


CCGTATACGC 


GATGCTGGGG 


GGCGCCGTGT 


480 




GGATCTCGCT 


GACGCAGGTG 


CTTGGGCTCC 


GGCGCCGCCT 


TCACAAGGAC 


CCAGACGCCG 


540 




GGCCCTGGGC GGCCGCGACC


CTGCGGGGCC 


TCTTTTTCTC 


CGTCTACGCA 


TTGGGGTTTG 


600 




CGGCGGGGGT 


GCTGGTGCGG 


CCGCGGATGG 


CGGCGAGCCG 


GCGGTCGGGG 


TGATCGCCAT 


660 




TTCAAATAAA 


AGGCACGAGT 


TCCCCGAATA 


CCACCGGCGT 


GTGATGATTT 


CGCCCTACCG 


720 




CTCCGATCCC 


CGGGGGGAGG 


GGGGAAGGAA 


ATGGGGGCGG 


GGGTGCCGTG 


GACGGGTATA 


780 




AAGGCCAGGG 


GGGCAGGCGG 


GCCCATCACT 


GTTAGGGTGT 


TAGGTTGGGA 


GGTGGCACAA 


840 




AAAGCGACAC 


ACCCGTGTTG 


TAGTTGTCCG 


CGGGAGGCGG 


TGGTTTCCGG 


CAACCCTCCT 


900 




CGCTGCGCCG 


GGCGCGCCCA 


CCGGTCCTTC 


GCGGGGGCCG 


GGGCTCTTCT 


GGTCATGGCC 


960 




CTTGGACGGG 


TGGGCCTAGC 


CGTGGGCCTG 


TGGGGCCTGC 


TGTGGGTGGG 


TGTGGTCGTG 


1020 




GTGCTGGCCA ATGCCTCCCC 


CGGACGCACG 


ATAACGGTGG 


GCCCGCGGGG 


GAAAGAGAGC 


1080 




AATGCCGCCC 


CCTCCGCGTC 


CCCGCGGAAC 


GCATCCGCCC 


CCCGAACCAC 


ACCCACGCCC 


1140 




CCCCAACCCC 


GCAAGGCGAC 


AAAAAGTAAG 


GCCTCCACCG 


CCAAACCGGC 


CCCGCCCCCC 


1200 




AAGACCGGGC 


CCCCGAAGAC 


ATCCTCGGAG 


CCCGTGCGAT 


GCAACCGCCA 


CAACCCGCTG 


1260 




GCCCGGTACG 


GCTTGCGGGT 


GCAAATCCGA 


TGCCGGTTTC 


CCAACTCCAC 


CCGCACGGAG 


1320 




TCCCGCCTCC 


AGATCTGGCG 


TTATGCCACG 


GCGACGGACG 


CCGAGATCGG 


AACGGCGCCT 


1380 




AGCTTAGAGG 


AGGTGATGGT 


AAACGTGTCG 


GCCCCGCCCG 


GGGGCCAACT 


GGTGTATGAC 


1440 




AGCGCCCCCA 


ACCGAACGGA 


CCCGCACGTG 


ATCTGGGCGG 


AGGGCGCCGG 


CCCGGGCGCC 


1500 
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AGCCCGCGGC TGTACTCGGT CGTCGGGCCG 
CTGGGTCGGC AGCGGCTCAT CATCGAAGAG 1560
CTGACCCTGG AGACCCAGGG CATGTACTAC TGGGTGTGGG GCCGGACGGA CCGCCCGTCC 1620
GCGTACGGGA CCTGGGTGCG CGTTCGCGTG TTCCGCCCTC CGTCGCTGAC CATCCACCCC 1680
CACGCGGTGC TGGAGGGCCA GCCGTTTAAG GCGACGTGCA CGGCCGCCAC CTACTACCCG 1740
GGCAACCGCG CGGAGTTCGT CTGGTTCGAG GACGGTCGCC GGGTATTCGA TCCGGCCCAG 1800
ATACACACGC AGACGCAGGA GAACCCCGAC GGCTTTTCCA CCGTCTCCAC CGTGACCTCC 1860
GCGGCCGTCG GCGGCCAGGG CCCCCCGCGC ACCTTCACCT GCCAGCTGAC GTGGCACCGC 1920
GACTCCGTGT CGTTCTCTCG GCGCAACGCC AGCGGCACGG CATCGGTGCT GCCGCGGCCA 1980
ACCATTACCA TGGAGTTTAC GGGCGACCAT GCGGTCTGCA CGGCCGGCTG TGTGCCCGAG 2040
GGGGTGACGT TTGCCTGGTT CCTGGGGGAC GACTCCTCGC CGGCGGAGAA GGTGGCCGTC 2100
GCGTCCCAGA CATCGTGCGG GCGCCCCGGC ACCGCCACGA TCCGCTCCAC CCTGCCGGTC 2160
TCGTACGAGC AGACCGAGTA CATCTGCCGG CTGGCGGGAT ACCCGGACGG AATTCCGGTC 2220
CTAGAGCACC ACGGCAGCCA CCAGCCCCCG CCGCGGGACC CCACCGAGCG GCAGGTGATC 2280
CGGGCGGTGG AGGGGGCGGG GATCGGAGTG GCTGTCCTTG TCGCGGTGGT TCTGGCCGGG 2340
ACCGCGGTAG TGTACCTCAC CCACGCCTCC TCGGTGCGCT ATCGTCGGCT GCGGTAACTC 2400
CGGGGCCGGG CCCGGCCGCC GGTTGTCTTC TTTTCCACCC CTTCCGTCCC CCGTACCCAC 2460
CACACCCCAC CCCACCCCCC CGCCGTCCCC CGGGCGTTAT AAGCCGCCGC ACTCGCTTTT 2520
CCCACCGGAA AATCCTCGGC CCGATCCGAA CGGCGCACGC CGCGTGGGCT CCAAACGCCT 2580
CCGCGAAGAG AGCGCCCCGC CCCGATATTC AAGCCCGCGG TGGTGCTATG GCTTTCCGTG 2640
CTTCGGGACC CGCCTACCAG CCCCTCGCCC CCGCGGCCTC CCCGGCGCGG GCTCGTGTTC 2700
CGGCCGTGGC CTGGATCGGC GTCGGAGCGA TCGTCGGGGC CTTTGCGCTC GTCGCCGCGT 2760
TGGTTCTCGT ACCCCCTCGG TCCTCGTGGG GACTCTCGCC GTGCGACAGC GGCTGGCAGG 2820
AATTCAACGC GGGATGCGTC GCGTGGGACC CCACCCCCGT CGAGCACGAG CAGGCGGTCG 2880
GCGGCTGCAG CGCGCCGGCC ACCCTTATCC CCCGTGCGGC CGCCAAGCAC CTGGCCGCTC 2940
TGACACGCGT CCAGGCGGAG AGATCGTCGG GTTACTGGTG GGTGAACGGA GACGGCATCC 3000
GGACCTGTCT GAGACTCGTC GACAGCGTCA GTGGCATCGA CGAGTTTTGC GAGGAGCTCG 3060
CGATCCGCAT ATGCTACTAC CCACGAAGCC CCGGCGGGTT TGTCCGCTTC GTAACTTCAA 3120
TACGTAACGC CCTGGGGTTG CCGTGAGGCG CGCGTCCGAC GGTCCCGCTT CTCGCCTCTC 3180
TTCTTCCCCC TCCCCACCCC ACCCACCGAC CAACGACGGC GTTTGGCCAA TACCCTCCTT 3240
TTTTCTTTTT CTCTTCCCCC CCCAAAAAAA AAAACAATAA ACAGCTAATT GCGTACGACA 3300
AACCATGCGG AACTCGCTGT TTTTTTTCCC CTGTTTGTTA CTTTTTATTG AAAACAGACA 3360
TACGGGGAAA GGGGCCGGAA ACCGAGACGG TGGGGCCGGC GGTCGCATTT TTTTAATGGC 3420
TCTGGTGTCG GCCGCGTTTG AGCTTCGTCA ACAGGGCGCT GAGGGCGGCG ACGTTTGTCG 3480
GGCCGTCGTT GGCCAGCGCG TTGGTCCGGG GGCGGGCGGG CATGGGCGAC AGGCTTAGTC 3540
CCGGGTCCGG GGCGCGTGTG GCCCCCGGAG GGGAGAAGAG GGCAGACCCG CCCCAGTCGT 3600
ACAGGGGATT TTCCGCCTCG ATGTACGGGG AGTCCGGGGC GTCTCCCGGC GGGGCCGCCC 3660
GGCCGGCGTC TTGCCGGCGA AGGCAGATGT TTTCGTATAC CCGAACCCAG GGGATCTCCT 3720
CGTAGACGCG CCCCCCATCC TCGCTCACCG ACTCGTAAAT GGAATCTGCC TCCTCGGAGG 3780
GGGCGCGGGG GGCGTGGCTT TCGGCCGGCC AGGCGGCGGC GGTGGTGTCG GCGGCGGGGG 3840
TGGAGCCAAG CCCGACGCCC GCGGGCATGG CGGCGTCATC CTCCGGCAGC AGATACGTGT 3900
TTTCCATCTG GTCCGGTTCG GCCTCCGCGT CTGGCCCCCA GGTCCGCACT GCGTTGTAAA 3960
CCCCGGCGGC CTCGCGATGA GCCGCGAGCG GGCGCGCCGC GGCTGCCGGC CGCTGCTCGG 4020
GGGGCGCGGG GTTGCGGGGC GGGAGGCGCG GGGGCGCCCC GGCCATATGC GTGTAATAAG 4080
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TGGCCGGCCG 


GCCGGCGCAG 


GGCTCGGGAC 


CCCGGTCGGC 


CGCGTCAACG 


TGCGGGGGCT 


4140 


CGGGGAGGTC 


CTCGCGGTGG 


CGCCTGAACC 


TCCGAGGGGC 


CGCGGGGGTC 


AATTGGGGGC 


4200 


PAPPPCGGGG 


GAGCGGCGGG 


GGTGCGTTAT 


CGCGCCGGGT 


CCGTTGTATC 


TTGTCCCGGC 


4260 


AGPTPPCGCC 


GACCGCGCCG 


CGGCCCCCCG 


GTGGGCCGGA 


CGCCGCGAGG 


CGCAGGATGG 


4320 


APTPGTAGTG 


GGGCGACGGG 


GTTCCGCTCC 


GAAGCAGGTC 


CGGGGCCAGG 


GCGGCCCCTA 


4380 


APPAPTAPTT 


GATGCTGAGT 


TCCATCCGGG 


CCCAGCTCGG 


GGCGGTCATC 


GTGGGGAACA 


4440 


GGGGGGCGGC 


GGTCCTGCAG 


AAGCGCTCCT 


GGCTGTCCAC 


CGCCGCCCGT 


AGGTACTCGT 


4500 


TGTTCAGGCT 


GTCGGAGGCC 


CAAACAACAT 


ACCCGGTAAG 


CGTCGCGTTA 


ATTATATACT 


4560 


GGGCGTGGTG 


GTCGACTATG 


GATAGAACCT 


CGACGGTCGA 


GACGATGGCG 


TCCACGATCC 


4620 


CGTACGTGCC 


GCCGCTGCGC 


TTGCCGGTCT 


CCCACAGGTG 


GGCCAGGCGC 


GTCAGGTGGC 


4680 


CCAGGACGTC 


GCTGACCGCC 


GCCCGCAGGG 


CCATGCACTG 


CATCGAGCCC 


GTGGTGCCGC 


4740 


TGGGCCCGCG 


GTCCAGGTGG 


CGCGCAAACG 


TCTCCGCGGG 


CGCCTCCAGA 


CTCCCGCTGA 


4800 


GCGCCACGAA 


CCGGCGATCG 


GCGGGGCCCA 


GGCGGCGACA 


CACGTACTTG 


TCCGCCGTCC 


4860 


ACAGCATCCA 


CGAGGCCCAA 


TGGTACAACA 


CGGAGACGTA 


GGCCAGGAGC 


TCGCTCAGCC 


4920 


GPAGTGCGGT 


GTCCGTGCTC 


GGCCGGCTCG 


GGTCTGCGGG 


GCGCATAAAA 


AACATGTACT 


4980 


GPTGGAGCCT 


GTGGGCCGCG 


TCGCGCAACC 


CCGCCACCGC 


GGCGGCGTAC 


TTGGCCGCGG 


5040 


CGGCCCCGCT 


CTTGAACGGG 


GCGCGCACCA 


CCAGCTTCGG 


GAGCAGGGTG 


GGCCGCATCA 


5100 


ACACGTGCAG 


GCTGGGGTCG 


CANTCGCCCG 


CCGGGTCGTC 


GGGGATGTCC 


AGGCCGCTGG 


5160 


GCACAACCGT 


CTGGAGGTAC 


TTCCAGTACT 


GCGCTAGGAT 


GGCGCGGCTC 


AGCTGGCCGC 


5220 


CCGACAGCTC 


CACCTCGCCG 


AGCGCCTGCT 


TGGCGGCCGA 


CGCGTAGTGC 


CGGATGTAGT 


5280 


CGTAGTGCGG 


GTCGCTGGCG 


AGCCCGTCTA 


CGATCAGGCT 


CTCGGGGACG 


GTGTTATGGT 


5340 


GCCGCGCCGC 


CAGCCGGACG 


CTGCGATCGG 


CGCCGGTCAG 


AAACGCCGGC 


TGCAGGTCGT 


5400 


CGGCGCGCTG 


CCGCAGGACG 


CCCACGGCCG 


CGCTGAGGAG 


CCCCTCCGGG 


GTGGGGAGCA 


5460 


GACACCCGGC 


GAAGATGCGC 


CGCTCGGGGA 


CGCCCGCGTT 


GGCGCCGCGG 


ATGAGGTTGG 


5520 


CCGGCGTCAG 


GCACCGCGCC 


AGCCGCAGGG 


AGCTCGCGCC 


GCGCGCCCGG 


CGTTGCATGG 


5580 


CGGAGACCGT 


TCGGTCGGGG 


GCCCGCCGGT 


CGGAGGTATG 


CCGCGTCCCG 


GGATATAGGG 


5640 


TTGCTTTTTA 


TGGGGAGGCG 


CCTATGGGCG 


TGGCGGGCCG 


CCCAGCCCGG 


TCGCGCGCCT 


5700 


CCCGGACACG 


TGCGCCCGGA 


GGGCGGCGGT 


CTCCTCGTCG 


CCCATGAGCA 


GTTTCCGAAA 


5760 


CTGCGCCATG 


ATGTCCACGA 


CGCGGACCCG 


CGGCCCCAGC 


ACGGACTCGC 


TATTCAGGGG 


5820 


GGCGGGGGGG 


AAGGCCGCCA 


GGTCTTCGAG 


CAGGAAGGCG 


GGGTCTGCCG 


TCCCGCTCAC 


5880 


GGGCGCCCGG 


GGCGCCGAGG 


ACGCGGGGCG 


AAGGTCCACG 


TGTTCCGCGG 


CGGCGCGCAC 


5940 


GTCCGCCCAA 


AATTTGGCGG 


GGGTGGTCCG 


CGCGTACAGG 


GGCTGGGTCG 


CGCGGAGGAC 


6000 


GCACGCGTAG 


CGCAGGGGGG 


TGTACGTGCC 


CACCTCGGGG 


GCCGTCGACC 


CGCCGTCAAA 


6060 


CGCGGCCAGG 


GCCACGCACG 


CGACCACCGT 


GTCGGCCAGG 


CCCAGCAGCC 


GCTGCAGGAT 


6120 


GAGCCCCGTC 


GCCAGCACGG 


CGCGCGCGGC 


CGCCGCGTTG 


TCCCTGCGCC 


GGCGCGCGTC 


6180 


\_ l« \3 C /VJXJV- ^- 










VjCoIAj 1 At- AL. 


6240


GGCCGCGCCC 


AGCACGGCGT 


TCAGCCCGCT 


GGTGGCGAGC 


AGGCGGCGCG 


CCGCGGTGTC 


6300 


GCCCAGCGCC 


TCGTGCTCGG 


CCGCCACGAC 


CCCGGGGCTA 


CCCAGGGGCA 


GGGCGCGAAA 


6360 


CAGCGCCTCC 


TGCTCCACGT 


CCGCAAACGC 


GGGGTGGGCG 


GAGTGCGGGT 


GCAGGCGCGC 


6420 


CCCCACGACC 


ACCGAGAGCC 


ACTGGACCGT 


CTGCTCCGCC 


AGGACCGCCA 


GCACGTCCAG 


6480 


GACGCGCCCC 


GCAAACGCGG 


CCTCCCGCGG 


GAGCACGCAT 


TTGACGGCGC 


CGGGGTTGAA 


6540 


GCGGGCGAGC 


AGAGCCCCGG 


TGGCGATGTA 


CGTCATGCGC 


CCCGCGTAGC 


GGGCGGCCAC 


6600 


GCGACAGTCG 


CGCCCCAGGA 


GCGCGCGCAC 


CCCGGGCCAG 


TACAGCAGGG 


ACCCCAGCGA 


6660 
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ACTGCGAAAG 


ACCGCGGCGT 


CGGGGCCGGG 


GTGGGGGGGC 


GCGGCCCCTC 


CCGCGCTGAG 


6720 


CAGCGGCACG 


GCGGCGGCCC 


CCACGGGCCG 


CAACGCCGTG 


AGGCTCGCGA 


ACTGCCGTCG 


6780 


GAGCTCGGCC 


GCCCTGTCGT 


CGAGCTCCGA 


GCCGCGCCCC 


TCCGTGTGCA 


GGCGCGTCCC 


6840 


GCAGACCCAC 


CCGTTGATCG 


CCACCCGCAC 


GATGGCGTCC 


ACCAGAAAGC 


CCATCGCGCG 


6900 


GGAGGGGCTG 


GTTTTTGCCC 


GCCGATCCGT 


CAGGTCGAGG 


ATCGCGTCGC 


CCGTGACGTA 


6960 


CCAGGCCAGC 


GCCTCGCCCT 


GCTGCAGCGT 


CTGGCGGAAA 


AACACCTTTG 


GGTCGGCCGG 


7020 


GGAGGCAAAG 


TGCATGACCC 


CCACGCGCGA 


CAGCCCGAAC 


GCGCTATCCG 


GACACGGGTA 


7080 


GAACCCGGCC 


GGATGTCCCA 


GGGCCAGGGC 


CGAGCGCACG 


GACTCGTCCC 


ACGCGGCGAC 


7140 


TCGGGGGGTC 


AGGCGGTCCA 


GGGGGAATGC


CGCCTGCAGC 


TCCGGGCCCG 


ACACGCGGCC 


7200 


CTCTATAATC 


TCGACCGTCG 


CGGGAGGCCG 


CGCCCCGGCG 


CCGTCATCGT 


GCGCGACGGC 


7260 


GGCGGGGTAG 


TCGTCCTCCT 


CGTAGCTGAG 


CTCGTCCAGG 


AACAGCGGCG 


AGGGCACCAC 


7320 


CCGCGAACCG 


CCCACCCGCC 


CCAAAACGTC 


GCGTGGGTCC 


ATCGGGCCCA 


GGTAGCCTCC 


7380 


CCGCGGGGCC 


CGCGTGATGG 


CGCTGTCCCG 


GCGTCCGCGA 


ACGGACTGGC 


TCCTGGCCGT 


7440 


AACGGACCTG 


GGGCGCGGAA 


AGGACGCCCG 


GCGGGGGGGC 


GCCGCCGCCC 


GGGCCTCGGA 


7500 


CGCGCGTCGG 


GACCCGGGGT 


GACCGCGGGC 


CTCCCGGCGA 


CGGCGCGGGG 


GCGGCTCTTC 


7560 


GCTCGCCATC 


TCCCCCGCGG 


CCTCGACCTC 


GCTGTCGTCG 


TCCACGTTAA 


ACACCGCCCG 


7620 


CAGGTACCCC 


ATTAACCCGA 


CTCCACCGCC 


CTCGGGCTCG 


TCCTCCACGG 


GCGATTCGGC 


7680 


GCGATGCGCG 


GACGGGGCAT 


GGGACCGGGT 


GGAGGCGCGC 


CTCCGGCGTA 


CGGCATGCCC 


7740 


GCGCACGGAC 


ATGGTGGCCG 


GAGGCCCGAT 


TTTTTACACA 


CCCCCTCCCC 


GCAAACGGAC 


7800 


AAGGAAAGGG 


GTGGTGCGAG 


GGGGGAGGCC 


CAAACGGGGA 


GGTGGGGGGT 


AGGGGGCGGT 


7860 


CCCAGGGAGC 


GGGGGGTACG 


AACCGGCACG 


ACGGGAACAG


AGAAACGCGA 


CCGCTCCAAC 


7920 


AAGGGTGGGG 


GGTGGGCCTC 


ATCCCCACGC 


AAACCCGCGG 


GCAAATGCGA 


GAACGGGACC 


7980 


CGCGCGCCTG 


CCTTTATACG 


CGGACCCCAG 


CACCACGAGC 


CGTTCTGTGA CCCGAATCTA 


8040 


CACGACCGCG 


GGCTCGTAGG 


CGCGACTAAC 


GCCCAACCCA 


ACGGCACACA 


CCCCCCACCC 


8100 


CGCGCGTAAC 


CCCATTTCTT 


TCATGGTCCC 


GTAATAAACA 


GCCAACGCAC 


GCCGCGTATG 


8160 


ATGAGTTGCT 


TGCCAATGTT 


TATTGCTGTG 


GTTGCGAACC 


CTCTATCGCG 


ATACAGACGG 


8220 


AGGTGAGGCG 


GGGCGGTGGT 


GGGGGGGGGG 


GCGCCCCCCC 


CGGTCGCACA 


TCCTACCCCC 


8280 


CAAAGTCGTC 


AATGCCCATG 


GCATCGGTAA 


ACATCTGTTC 


AAACTCAAAA TCGTCCACGT 


8340 


CCAAAGCCCC 


ATACAAAACG 


GGGTCGTGGG 


TCATTCCCGG 


GGAGGGGGAC 


TCCACGTCCC 


8400 


CCAGCATCTC 


CAAGTCGAAG 


TCGTCCAGGG 


CGTCGGCGGG 


CGTCATATCC 


ACCTCCTCGC 


8460 


CGTCCAGGCG 


GAGTTCGTCT 


CCCAGGCTGA 


CGTCGGTAAT 


GGGGGCGGTG 


GTGGACAGTC 


8520 


TGCGGGGGCG 


TTGTCCCGCG 


GAGAGAAACG 


ACATGCGCGG 


CGCCACCAGC 


CCGGCCTCCG 


8580 


CGGGAGCGTC 


ATCGTCGTCC 


GGGAGGTCGA 


GCAGGCCCTC 


GATTGTCGAT 


CCGTAATTAT 


8640 


TTCTGGTCCG 


CCCGCGGCTA 


TACGCGTGCT 


CCCGCATGAC 


GGACTCGCCC 


TCCGAGGTCG 


8700 


CAACGCTGGA 


GTACGAGTCC 


AACTTGGCCC 


GGATCAGCAG 


CATAAAGTAC 


CCAGAGGAGC 


8760 


GGGCCTGGTT 


GCCCTGCAGG 


ACGGGCGGGG 


TCGTGAGGGG 


CGCCCCGGGT 


TCCTCCGCCG 


8820 


CCGCACTTCG 


CACCAGCGGG 


AGGTTCAGGT 


GCTCGCGAAT 


GTGGTTTAGC 


TCCCGCAGTC 


8880 


GCCGGGCCTC 


CACGGGAACT 


CCCCGCACGG 


TGAGCGATCC 


GTTGATAAAC 


ATCAGGGGCT 


8940 


GAAACAGACA 


CGCCAACTGG 


CGCCAGCTCT 


CCAGGTCGCA 


GCAGAGGCCG 


TCGAACAGAT 


9000 


CGGGCCGCAT 


CATCTGCTCG 


GCGTACGCGG 


CCCATAGGAT 


CTCGCGGCTC 


AGAAAGAGGT 


9060 


ATAGATGCAG 


AAACAGGACG 


CGCGCCAGGC 


GCGCGGTCTC 


GCGGTAGTAC 


CTGTCCGCGA 


9120 


TCGTGGTGCG 


CAGCATCTCC 


CGCAGGTCGC 


GGTTGCGGCC 


CCGCATGTGT 


GCCTGGCGGT 


9180 


GTAGCTGCCG 


AACGCTGGCG 


CGCAGGTACC 


GGTACAGGGC 


CGAGCAAAAA 


TTTGCCAACA 


9240

CGGTCCGGTA 


GCTCTCCTCC 


CGCGCCCGCA 


GCTCACCGCG 


GAAAAACTGC 


GCCATGGCCT 


9300 


CGTAGTACGA 


AGGCAGCTCG 


TCGCGGGTGG 


CGGGCAGGGT 


GGGGAACGCC 


ACGTCGCCGT 


9360 


GGGCGCGAAT 


GTCGATCGGG 


GAGCGCTCGG 


GGACGTGCGC 


ATCCCCCCAG 


TCGATCACGT 


9420 


CGCTGGGCAG 


CGTCGACAGA 


AACTTGCACT 


CCCGGTACAT 


GTCGGCGTTG 


GTCGGGAACC 


9480 


CAGAGAACAG 


GTCCTCGTTC 


CAGGTATCTA 


GCATGGTACA 


CAGCGCGGGA 


CCCGCGCTGA 


9540 


AGCCCAGATC 


GTCGAGGAGA 


CGGTTAAACA 


GGGCCGCGGG 


GGGGACGGGC 


ATGGGCGGCG 


9600 


AGGGCATCAG 


CTGGGCCTGA 


CTCAGCCGAC 


CGGTGGCGTA 


CAGCGGAGGG 


GCGGCTGGGG 


9660 


TGTTCTTGGG 


ACCCCCGGCT 


GGCCTGGGGG 


GCGGTGGCGA 


AACCCCGTCC 


GCGTCCGCAA 


9720 


ACAGATTGTT 


GACCAACAGG 


TCCATGGGGG 


CGGTTGGGTC 


CGGGGATAAC 


GATTTTGAGA 


9780 


GGCGAATGAG 


AAGTGCCCGA 


GCGCCCGGCG 


GCGGAGAGGG GGGGAGGGAT CCGGGACCCG 


9840 


CGACAGAAAA 


AGGCCGGGGC 


CCTTGCGAAG 


GGAATTGCCG 


GGGGTGCCGT 


GCGTCCCCGA 


9900 


TGACTGACAT 


CTCTCTTCCT 


CCCCCCCGCA 


TTTTTAGTAT 


CACCCCAATT 


GCCGCCCCAA 


9960 


AACCTTCTTG 


ACTTCCCCCA 


CCCGTTTCCG 


TGGCGGCCCC 


TTCCCCCCTG 


CTCCTCTGTA 


10020 


ACGGGATGGT 


CTTATTCCCT 


CCTTCCCCTG 


GCCCTTCCCC 


CTCCTCTCTT 


CCTTTTTCCT 


10080 


TCCCCTTCTT 


CCGTCACTCC 


TTCCTCCCCT 


CTCTCGATTC 


CTCCCTTCTT 


CCCCATCTCT 


10140 


TCCTTCTCCT 


CTCACTCTCA 


TATCCTTCAA 


TACTCTCCTC 


CTCTCTATCT 


TTCCCCCCGC 


10200 


TTCTTCTCTC 


T 










10212 



(2) INFORMATION FOR SEQ ID NO: 105: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

Val Gly Val Gly Val Arg Gly Trp Gly Gly Gly Xaa Cys Gly Gly Leu
1 5 10 15

Val Val Gly Gly Trp Cys Trp Val Xaa Trp Gly Gly Trp Val Gly Val
20 25 30

Phe Phe Cys Phe Phe Leu Phe Cys Xaa Xaa Phe Xaa Xaa Xaa Xaa Xaa
35 40 45

Xaa Xaa Phe Leu Ala Pro Asp Leu Thr Asp Pro Leu Leu Phe Ala Tyr
50 55 60

Val Gly Phe Gln Val Val Asn His Gly Leu Met Phe Val Val Pro Asp
65 70 75 80

Ile Ala Val Tyr Ala Met Leu Gly Gly Ala Val Trp Ile Ser Leu Thr
85 90 95

Gln Val Leu Gly Leu Arg Arg Arg Leu His Lys Asp Pro Asp Ala Gly
100 105 110

Pro Trp Ala Ala Ala Thr Leu Arg Gly Leu Phe Phe Ser Val Tyr Ala
115 120 125

Leu Gly Phe Ala Ala Gly Val Leu Val Arg Pro Arg Met Ala Ala Ser
130 135 140

Arg Arg Ser Gly
145




(2) INFORMATION FOR SEQ ID NO: 106: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 538 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

Met Gly Ala Gly Val Pro Trp Thr Gly Ile Lys Arg Ala Gly Gly Pro
1 5 10 15

Ile Thr Val Arg Val Leu Gly Trp Glu Val Ala Gln Lys Ala Thr His
20 25 30

Pro Cys Cys Ser Cys Pro Arg Glu Ala Val Val Ser Gly Asn Pro Pro
35 40 45

Arg Cys Ala Gly Arg Ala His Arg Ser Phe Ala Gly Ala Gly Ala Leu
50 55 60

Leu Val Met Ala Leu Gly Arg Val Gly Leu Ala Val Gly Leu Trp Gly
65 70 75 80

Leu Leu Trp Val Gly Val Val Val Val Leu Ala Asn Asp Gly Arg Thr
85 90 95

Ile Thr Val Gly Pro Arg Gly Lys Glu Ser Asn Ala Ala Pro Ser Asp
100 105 110

Arg Asn Ala Ser Ala Pro Arg Thr Thr Pro Thr Pro Pro Gln Pro Arg
115 120 125

Lys Ala Thr Lys Ser Lys Ala Ser Thr Ala Lys Pro Ala Pro Pro Pro
130 135 140

Lys Thr Gly Pro Pro Lys Thr Ser Ser Glu Pro Val Arg Cys Asn Arg
145 150 155 160

His Asn Pro Leu Ala Arg Tyr Gly Leu Arg Val Gln Ile Arg Cys Arg
165 170 175

Phe Pro Asn Ser Thr Arg Thr Glu Ser Arg Leu Gln Ile Trp Arg Tyr
180 185 190

Ala Thr Ala Thr Asp Ala Glu Ile Gly Thr Ala Pro Ser Leu Glu Glu
195 200 205

Val Met Val Asn Val Ser Ala Pro Pro Gly Gly Gln Leu Val Tyr Asp
210 215 220

Ser Ala Pro Asn Arg Thr Asp Pro His Val Ile Trp Ala Glu Gly Ala
225 230 235 240

Gly Pro Gly Asp Arg Lys Val Val Gly Pro Leu Gly Arg Gln Arg Leu
245 250 255

Ile Ile Glu Glu Leu Thr Leu Glu Thr Gln Gly Met Tyr Tyr Trp Val
260 265 270

Trp Gly Arg Thr Asp Arg Pro Ser Ala Tyr Gly Thr Trp Val Arg Val
275 280 285

Arg Val Phe Arg Pro Pro Ser Leu Thr Ile His Pro His Ala Val Leu
290 295 300

Glu Gly Gln Pro Phe Lys Ala Thr Cys Thr Ala Ala Thr Tyr Tyr Pro
305 310 315 320

Gly Asn Arg Ala Glu Phe Val Trp Phe Glu Asp Gly Arg Arg Val Phe
325 330 335

Asp Pro Ala Gln Ile His Thr Gln Thr Gln Glu Asn Pro Asp Gly Phe
340 345 350

Ser Thr Val Ser Thr Val Thr Ser Ala Ala Val Gly Gly Gln Gly Pro
355 360 365

Pro Arg Thr Phe Thr Cys Gln Leu Thr Trp His Arg Asp Ser Val Ser
370 375 380

Phe Ser Arg Arg Asn Ala Ser Gly Thr Ala Ser Val Leu Pro Arg Pro
385 390 395 400

Thr Ile Thr Met Glu Phe Thr Gly Asp His Ala Val Cys Thr Ala Gly
405 410 415

Cys Val Pro Glu Gly Val Thr Phe Ala Trp Phe Leu Gly Asp Asp Ser
420 425 430

Ser Pro Ala Glu Lys Val Ala Val Ala Ser Gln Thr Ser Cys Gly Arg
435 440 445

Pro Gly Thr Ala Thr Ile Arg Ser Thr Leu Pro Val Ser Tyr Glu Gln
450 455 460

Thr Glu Tyr Ile Cys Arg Leu Ala Gly Tyr Pro Asp Gly Ile Pro Val
465 470 475 480

Leu Glu His His Gly Ser His Gln Pro Pro Pro Arg Asp Pro Thr Glu
485 490 495

Arg Gln Val Ile Arg Ala Val Glu Gly Ala Gly Ile Gly Val Ala Val
500 505 510

Leu Val Ala Val Val Leu Ala Gly Thr Ala Val Val Tyr Leu Thr His
515 520 525

Ala Ser Ser Val Arg Tyr Arg Arg Leu Arg
530 535

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 170 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

Met Ala Phe Arg Ala Ser Gly Pro Ala Tyr Gln Pro Leu Ala Pro Ala
1 5 10 15

Asp Ala Arg Ala Arg Val Pro Ala Val Ala Trp Ile Gly Val Gly Ala
20 25 30

Ile Val Gly Ala Phe Ala Leu Val Ala Ala Leu Val Leu Val Pro Pro
35 40 45

Arg Ser Ser Trp Gly Leu Ser Pro Cys Asp Ser Gly Trp Gln Glu Phe
50 55 60

Asn Ala Gly Cys Val Ala Trp Asp Pro Thr Pro Val Glu His Glu Gln
65 70 75 80

Ala Val Gly Gly Cys Ser Ala Pro Ala Thr Leu Ile Pro Arg Ala Ala
85 90 95

Ala Lys His Leu Ala Ala Leu Thr Arg Val Gln Ala Glu Arg Ser Ser
100 105 110

Gly Tyr Trp Trp Val Asn Gly Asp Gly Ile Arg Thr Cys Leu Arg Leu
115 120 125

Val Asp Ser Val Ser Gly Ile Asp Glu Phe Cys Glu Glu Leu Ala Ile
130 135 140

Arg Ile Cys Tyr Tyr Pro Arg Ser Pro Gly Gly Phe Val Arg Phe Val
145 150 155 160

Thr Ser Ile Arg Asn Ala Leu Gly Leu Pro
165 170



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 215 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Met Ala Gly Ala Pro Pro Arg Leu Pro Pro Arg Asn Pro Ala Pro Pro
1 5 10 15

Glu Gln Arg Pro Ala Ala Ala Ala Arg Pro Leu Ala Ala His Arg Glu
20 25 30

Ala Ala Gly Val Tyr Asn Ala Val Arg Thr Trp Gly Pro Asp Ala Glu
35 40 45

Ala Glu Pro Asp Gln Met Glu Asn Thr Tyr Leu Leu Pro Glu Asp Asp
50 55 60

Ala Ala Met Pro Ala Gly Val Gly Leu Gly Ser Thr Pro Ala Ala Asp
65 70 75 80

Thr Thr Ala Ala Ala Trp Pro Ala Glu Ser His Ala Pro Arg Ala Pro
85 90 95

Ser Glu Glu Ala Asp Ser Ile Tyr Glu Ser Val Ser Glu Asp Gly Gly
100 105 110

Arg Val Tyr Glu Glu Ile Pro Trp Val Arg Val Tyr Glu Asn Ile Cys
115 120 125

Leu Arg Arg Gln Asp Ala Gly Gly Ala Ala Pro Pro Gly Asp Ala Pro
130 135 140

Asp Ser Pro Tyr Ile Glu Ala Glu Asn Pro Leu Tyr Asp Trp Gly Gly
145 150 155 160

Ser Ala Leu Phe Ser Pro Pro Gly Ala Thr Arg Ala Pro Asp Pro Gly
165 170 175

Leu Ser Leu Ser Pro Met Pro Ala Arg Pro Arg Thr Asn Ala Asn Asp
180 185 190

Gly Pro Thr Asn Val Ala Ala Leu Ser Ala Leu Leu Thr Lys Leu Lys
195 200 205

Arg Gly Arg His Gln Ser His
210 215

(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 393 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Met Gln Arg Arg Arg Ala Ser Ser Leu Arg Leu Ala Arg Cys Leu Thr
1 5 10 15

Pro Ala Asn Leu Ile Arg Gly Ala Asn Ala Gly Val Pro Glu Arg Arg
20 25 30

Ile Phe Ala Gly Cys Leu Leu Pro Thr Pro Glu Gly Leu Leu Ser Ala
35 40 45

Ala Val Gly Val Leu Arg Gln Arg Ala Asp Asp Leu Gln Pro Ala Phe
50 55 60

Leu Thr Gly Ala Asp Arg Ser Val Arg Leu Ala Ala Arg His His Asn
65 70 75 80

Thr Val Pro Glu Ser Leu Ile Val Asp Gly Leu Ala Ser Asp Pro His
85 90 95

Tyr Asp Tyr Ile Arg His Tyr Ala Ser Ala Ala Lys Gln Ala Leu Gly
100 105 110

Glu Val Glu Leu Ser Gly Gly Gln Leu Ser Arg Ala Ile Leu Ala Gln
115 120 125

Tyr Trp Lys Tyr Leu Gln Thr Val Val Pro Ser Gly Leu Asp Ile Pro
130 135 140

Asp Asp Pro Ala Gly Xaa Cys Asp Pro Ser Leu His Val Leu Met Arg
145 150 155 160

Pro Thr Leu Leu Pro Lys Leu Val Val Arg Ala Pro Phe Lys Ser Gly
165 170 175

Ala Ala Ala Ala Lys Tyr Ala Ala Ala Val Ala Gly Leu Arg Asp Ala
180 185 190

Ala His Arg Leu Gln Gln Tyr Met Phe Phe Met Arg Pro Ala Asp Pro
195 200 205

Ser Arg Pro Ser Thr Asp Thr Ala Leu Arg Leu Ser Glu Leu Leu Ala
210 215 220

Tyr Val Ser Val Leu Tyr His Trp Ala Ser Trp Met Leu Trp Thr Ala
225 230 235 240

Asp Lys Tyr Val Cys Arg Arg Leu Gly Pro Ala Asp Arg Arg Phe Val
245 250 255

Ser Gly Ser Leu Glu Ala Pro Ala Glu Thr Phe Ala Arg His Leu Asp
260 265 270

Arg Gly Pro Ser Gly Thr Thr Gly Ser Met Gln Cys Met Ala Leu Arg
275 280 285

Ala Ala Val Ser Asp Val Leu Gly His Leu Thr Arg Leu Ala His Leu
290 295 300

Trp Glu Thr Gly Lys Arg Ser Gly Gly Thr Tyr Gly Ile Val Asp Ala
305 310 315 320

Ile Val Ser Thr Val Glu Val Leu Ser Ile Val His His His Ala Gln
325 330 335

Tyr Ile Ile Asn Ala Thr Leu Thr Gly Tyr Val Val Trp Ala Ser Asp
340 345 350

Ser Leu Asn Asn Glu Tyr Leu Arg Ala Ala Val Asp Ser Gln Glu Arg
355 360 365

Phe Cys Arg Thr Ala Ala Pro Leu Phe Pro Thr Met Thr Ala Pro Ser
370 375 380

Trp Ala Arg Met Glu Leu Ser Ile Lys
385 390



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 680 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Met Ser Val Arg Gly His Ala Val Arg Arg Arg Arg Ala Ser Thr Arg
1 5 10 15

Ser His Ala Pro Ser Ala His Arg Ala Glu Ser Pro Val Glu Asp Glu
20 25 30

Pro Glu Gly Gly Gly Val Gly Leu Met Gly Tyr Leu Arg Ala Val Phe
35 40 45

Asn Val Asp Asp Asp Ser Glu Val Glu Ala Ala Gly Glu Met Ala Ser
50 55 60

Glu Glu Pro Pro Pro Arg Arg Arg Arg Glu Arg His Pro Gly Ser Arg
65 70 75 80

Arg Ala Ser Glu Ala Arg Ala Ala Ala Pro Pro Arg Arg Ala Ser Phe
85 90 95

Pro Arg Pro Arg Ser Val Thr Ala Arg Ser Gln Ser Val Arg Gly Arg
100 105 110

Arg Asp Ser Ala Ile Thr Arg Ala Pro Arg Gly Gly Tyr Leu Gly Pro
115 120 125

Met Asp Pro Arg Asp Val Leu Gly Arg Val Gly Gly Ser Arg Val Val
130 135 140

Pro Ser Pro Leu Phe Leu Asp Glu Leu Ser Tyr Glu Glu Asp Asp Tyr
145 150 155 160

Pro Ala Ala Val Ala His Asp Asp Gly Ala Gly Ala Arg Pro Pro Ala
165 170 175

Thr Val Glu Ile Ile Glu Gly Arg Val Ser Gly Pro Glu Leu Gln Ala
180 185 190

Ala Phe Pro Leu Asp Arg Leu Thr Pro Arg Val Ala Ala Trp Asp Glu
195 200 205

Ser Val Arg Ser Ala Leu Gly His Pro Ala Gly Phe Tyr Pro Cys Pro
210 215 220

Asp Ser Ala Phe Gly Leu Ser Arg Val Gly Val Met His Phe Asp Ala
225 230 235 240

Asp Pro Lys Val Phe Phe Arg Gln Thr Leu Gln Gln Gly Glu Ala Trp
245 250 255

Tyr Val Thr Gly Asp Ala Ile Leu Asp Leu Thr Asp Arg Arg Ala Lys
260 265 270

Thr Ser Pro Ser Arg Ala Met Gly Phe Leu Val Asp Ala Ile Val Arg
275 280 285

Val Ala Ile Asn Gly Trp Val Cys Gly Thr Arg Leu His Thr Glu Gly
290 295 300

Arg Gly Ser Glu Leu Asp Asp Arg Ala Ala Glu Leu Arg Arg Gln Phe
305 310 315 320

Ala Ser Leu Thr Ala Leu Arg Pro Val Gly Ala Ala Ala Val Pro Leu
325 330 335

Leu Ser Ala Gly Gly Ala Ala Pro Pro His Pro Gly Pro Asp Ala Ala
340 345 350

Val Phe Arg Ser Ser Leu Gly Ser Leu Leu Tyr Trp Pro Gly Val Arg
355 360 365

Ala Leu Leu Gly Arg Asp Cys Arg Val Ala Ala Arg Tyr Ala Gly Arg
370 375 380

Met Thr Tyr Ile Ala Thr Gly Ala Leu Leu Ala Arg Phe Asn Pro Gly
385 390 395 400

Ala Val Lys Cys Val Leu Pro Arg Glu Ala Ala Phe Ala Gly Arg Val
405 410 415

Leu Asp Val Leu Ala Val Leu Ala Glu Gln Thr Val Gln Trp Leu Ser
420 425 430

Val Val Val Gly Ala Arg Leu His Pro His Ser Ala His Pro Ala Phe
435 440 445

Ala Asp Val Glu Gln Glu Ala Leu Phe Arg Ala Leu Pro Leu Gly Ser
450 455 460

Pro Gly Val Val Ala Ala Glu His Glu Ala Leu Gly Asp Thr Ala Ala
465 470 475 480

Arg Arg Leu Leu Ala Thr Ser Gln Ala Val Leu Gly Ala Ala Val Tyr
485 490 495

Ala Leu His Thr Ala Thr Val Thr Leu Lys Tyr Ala Cys Gly Asp Ala
500 505 510

Arg Arg Arg Arg Asp Asn Ala Ala Ala Ala Arg Ala Val Leu Ala Thr
515 520 525

Gly Leu Ile Leu Gln Arg Leu Leu Gly Leu Ala Asp Thr Val Val Ala
530 535 540

Cys Val Ala Ala Phe Asp Gly Gly Ser Thr Ala Pro Glu Val Gly Thr
545 550 555 560

Tyr Thr Pro Leu Arg Tyr Ala Cys Val Leu Arg Ala Thr Gln Pro Leu
565 570 575

Tyr Ala Arg Thr Thr Pro Ala Lys Phe Trp Ala Asp Val Arg Ala Ala
580 585 590

Ala Glu His Val Asp Leu Arg Pro Ala Ser Ser Ala Pro Arg Ala Pro
595 600 605

Val Ser Gly Thr Ala Asp Pro Ala Phe Leu Leu Glu Asp Leu Ala Ala
610 615 620

Phe Pro Pro Ala Pro Leu Asn Ser Glu Ser Val Leu Gly Pro Arg Val
625 630 635 640

Arg Val Val Asp Ile Met Ala Gln Phe Arg Lys Leu Leu Met Gly Asp
645 650 655

Glu Glu Thr Ala Ala Leu Arg Ala His Val Ser Gly Arg Arg Ala Thr
660 665 670

Gly Leu Gly Gly Pro Pro Arg Pro
675 680

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 556 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Val Ile Leu Lys Met Arg Gly Gly Gly Arg Glu Met Ser Val Ile Gly
1 5 10 15

Asp Ala Arg His Pro Arg Gln Phe Pro Ser Gln Gly Pro Arg Pro Phe
20 25 30

Ser Val Ala Gly Pro Gly Ser Leu Pro Pro Ser Pro Pro Pro Gly Ala
35 40 45

Arg Ala Leu Leu Ile Arg Leu Ser Lys Ser Leu Ser Pro Asp Pro Thr
50 55 60

Ala Pro Met Asp Leu Leu Val Asn Asn Leu Phe Ala Asp Ala Asp Gly
65 70 75 80

Val Ser Pro Pro Pro Pro Arg Pro Ala Gly Gly Pro Lys Asn Thr Pro
85 90 95

Ala Ala Pro Pro Leu Tyr Ala Thr Gly Arg Leu Ser Gln Ala Gln Leu
100 105 110

Met Pro Ser Pro Pro Met Pro Val Pro Pro Ala Ala Leu Phe Asn Arg
115 120 125

Leu Leu Asp Asp Leu Gly Phe Ser Ala Gly Pro Ala Leu Cys Thr Met
130 135 140

Leu Asp Thr Trp Asn Glu Asp Leu Phe Ser Gly Phe Pro Thr Asn Ala
145 150 155 160

Asp Met Tyr Arg Glu Cys Lys Phe Leu Ser Thr Leu Pro Ser Asp Val
165 170 175

Ile Asp Trp Gly Asp Ala His Val Pro Glu Arg Ser Pro Ile Asp Ile
180 185 190

Arg Ala His Gly Asp Val Ala Phe Pro Thr Leu Pro Ala Thr Arg Asp
195 200 205

Glu Leu Pro Ser Tyr Tyr Glu Ala Met Ala Gln Phe Phe Arg Gly Glu
210 215 220

Leu Arg Ala Arg Glu Glu Ser Tyr Arg Thr Val Leu Ala Asn Phe Cys
225 230 235 240

Ser Ala Leu Tyr Arg Tyr Leu Arg Ala Ser Val Arg Gln Leu His Arg
245 250 255

Gln Ala His Met Arg Gly Arg Asn Arg Asp Leu Arg Glu Met Leu Arg
260 265 270

Thr Thr Ile Ala Asp Arg Tyr Tyr Arg Glu Thr Ala Arg Leu Ala Arg
275 280 285

Val Leu Phe Leu His Leu Tyr Leu Phe Leu Ser Arg Glu Ile Leu Trp
290 295 300

Ala Ala Tyr Ala Glu Gln Met Met Arg Pro Asp Leu Phe Asp Gly Leu
305 310 315 320

Cys Cys Asp Leu Glu Ser Trp Arg Gln Leu Ala Cys Leu Phe Gln Pro
325 330 335

Leu Met Phe Ile Asn Gly Ser Leu Thr Val Arg Gly Val Pro Val Glu
340 345 350

Ala Arg Arg Leu Arg Glu Leu Asn His Ile Arg Glu His Leu Asn Leu
355 360 365

Pro Leu Val Arg Ser Ala Ala Ala Glu Glu Pro Gly Ala Pro Leu Thr
370 375 380

Thr Pro Pro Val Leu Gln Gly Asn Gln Ala Arg Ser Ser Gly Tyr Phe
385 390 395 400

Met Leu Leu Ile Arg Ala Lys Leu Asp Ser Tyr Ser Ser Val Ala Thr
405 410 415

Ser Glu Gly Glu Ser Val Met Arg Glu His Ala Tyr Ser Arg Gly Arg
420 425 430

Thr Arg Asn Asn Tyr Gly Ser Thr Ile Glu Gly Leu Leu Asp Leu Pro
435 440 445

Asp Asp Asp Asp Ala Pro Ala Glu Ala Gly Leu Val Ala Pro Arg Met
450 455 460

Ser Phe Leu Ser Ala Gly Gln Arg Pro Arg Arg Leu Ser Thr Thr Ala
465 470 475 480

Pro Ile Thr Asp Val Ser Leu Gly Asp Glu Leu Arg Leu Asp Gly Glu
485 490 495

Glu Val Asp Met Thr Pro Ala Asp Ala Leu Asp Asp Phe Asp Leu Glu
500 505 510

Met Leu Gly Asp Val Glu Ser Pro Ser Pro Gly Met Thr His Asp Pro
515 520 525

Val Leu Tyr Gly Ala Leu Asp Val Asp Asp Phe Glu Phe Glu Gln Met
530 535 540

Phe Thr Asp Ala Met Gly Ile Asp Asp Phe Gly Gly
545 550 555



(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7362 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

CGCGGGGGAG GGGACGACGC GGGGGAGGGG ACGACGCGGG GGAGGGGAGG ACGCGGGGGA 60

TATATAAAGC GGTACAAAGC GCGGGAATGG GCATATTGGA CCCGCGTGAT TCGGTTGCTC 120

GCGGTTGTCT TGTTTGGACG TTTTTTATGC GGGAACAAGG GGGCTTACCG GTTACACTGT 180

CCGCTCGCTA TGGGGTTCGT CTGTCTGTTT GGGCTTGTCG TTATGGGAGC CTGGGGGGCG 240

TGGGGTGGGT CACAGGCAAC CGAATATGTT CTTCGTAGTG TTATTGCCAA AGAGGTGGGG 300

GACATACTAA GAGTGCCTTG CATGCGGACC CCCGCGGACG ATGTTTCTTG GCGCTACGAG 360

GCCCCGTCCG TTATTGACTA TGCCCGCATA GACGGAATAT TTCTTCGCTA TCACTGCCCG 420

GGGTTGGACA CGTTTTTGTG GGATAGGCAC GCCCAGAGGG CGTATCTGGT TAACCCCTTT 480

CTCTTTGCGG CGGGATTTTT GGAGGACTTG AGTCACTCTG TGTTTCCGGC CGACACCCAG 540

GAAACAACGA CGCGCCGGGC CCTTTATAAA GAGATACGCG ATGCGTTGGG CAGTCGAAAA 600

CAGGCCGTCA GCCACGCACC CGTCAGGGCC GGGTGTGTAA ACTTTGACTA CTCACGCACT 660 

CGCCGCTGCG TCGGGCGACG CGATTTACGG CCTGCCAACA CCACGTCAAC GTGGGAACCG 720 

CCTGTGTCGT CGGACGATGA AGCGAGCTCG CAGTCGAAGC CCCTCGCCAC CCAGCCGCCC 780 

GTCCTCGCCC TTTCGAACGC CCCCCCACGG CGGGTCTCCC CGACGCGAGG TCGGCGCCGG 840

CATACTCGCC TCCGACGCAA CTAGCCACGT CTGCATCGCA AGCCACCCTG GGTCGGGAGC 900 

AGGATATCCG ACCCGTCTAG CGGCCGGGTC GGCTGTCCAG CGTCGTCGCC CTAGAGGCTG 960

TCCGCCGGGC GTGATGTTTT CCGCATCTAC GACCCCCGAA CAGCCCCTGG GGCTGTCGGG 1020 

CGATGCGACG CCGCCCCTGC CGACTTCCGT GCCCCTGGAC TGGGCCGCGT TTCGGCGCGC 1080 

GTTTCTGATC GACGACGCCT GGCGGCCCCT GTTGGAGCCG GAGCTCGCGA ACCCCCTAAC 1140

CGCGCGCCTC CTCGCGGAGT ATGACCGTCG GTGCCAGACC GAAGAGGTGC TGCCGCCGCG 1200 

GGAGGATGTG TTCTCCTGGA CGCGGTATTG TACCCCCGAC GACGTGCGCG TGGTTATGAT 1260 

CGGGCAGGAC CCGTACCACC ATCCCGGCCA GGCGCACGGC CTGGCGTTTA GCGTGCGTGC 1320 

GGATGTGCCG GTGCCTCCGA GTCTACGGAA CGTGCTGGCG GCGGTAAAAA ATTGTTACCC 1380 

CGACGCGCGC ATGAGCGGCC GCGGCTGCCT GGAAAAGTGG GCTCGCGACG GCGTGCTGTT 1440

GTTGAACACG ACCCTGACCG TCAAGCGCGG GGCGGCGGCG TCCCACTCCA AGCTTGGATG 1500 

GGACCGCTTT GTGGGCGGGG TGGTCCGACG GCTGGCCGCG CGCCGCCCGG GCCTGGTCTT 1560 

TATGCTCTGG GGCGCCCATG CCCAGAACGC GATCAGGCCC GACCCTCGCC AACACTACGT 1620 

CCTCAAGTTT TCTCACCCGT CGCCCCTCTC CAAGGTCCCG TTTGGGACGT GCCAGCATTT 1680 

CCTCGCCGCG AATCGCTACC TCGAAACCCG GGACATTATG CCTATCGACT GGTCGGTATA 1740

AGATGCCGAC ATCCGGGGTC TTGATTTACG AGGGGGCAAT TAATAAAGAC TGTTGATGGT 1800 

TAAATCTCGG GTCTCATACC GGTCCGTGAT GTCGGGCGTG GGGGAAGAGA GGGTCCCCTC 1860 

TGCGTTTACT ATCCTTGCCT CGTGGGGCTG GACGTTTGCA CCCCAGAACC ATGATCCTGG 1920 

CGCGTCGCCG AATACGACGC CCATAGAGTC GATTGCGGGG ACCGCACCGG ACGCGCACGT 1980 

GGGGCCTCTC GACGGAGAGC CGGACCGGGA TGCGATCTCC CCGCTTACGT CGAGCGTGGC 2040

CGGCGACCCG CCGGGGGCGG ACGGCCCCTA CGTCACCTTT GATACTCTGT TTATGGTATC 2100 

TTCGATCGAC GAACTGGGGC GCCGCCAGCT CACGGATACG ATCCGTAAGG ACCTGCGGCT 2160 

GTCGCTGGCC AAGTTCAGCA TCGCGTGTAC CAAGACCTCG TCGTTTTCGG GGACGGCCGC 2220 

GCGCCAGCGC AAGCGCGGAG CACCGCCGCA ACGCACATGC GTACCACGCA GCAACAAGAG 2280 

CCTCCAGATG TTCGTTTTGT GCAAGCGCGC CAACGCCGCG CAGGTGCGCG AGCAGCTGCG 2340

GGCGGTTATT CGGTCGCGCA AGCCGCGCAA GTATTACACG CGGTCCTCGG ATGGGCGGCT 2400 

CTGCCCGGCC GTCCCCGTGT TTGTACACGA GTTTGTTTCG TCCGAACCCA TGCGCCTCCA 2460 

TCGAGATAAC GTCATGCTGT CTACGGAACC AGACTAAGCA CCCCCGCCGT CCCCTTTCTT 2520 

TTCCCCCTAC CCTTCCCCCC GTTACTGATG TGTTGTGACG TTTCAATAAA TAACACGTAG 2580 

CTTATTTTGT TGGATGATGG ATTGATTGAT TTTATTGACC GTTCGTTCGC CCGGCGGTGC 2640

CGTCGCCGCG CGCAGAGGGA ATATGCAAGC GGGCGGGGTG GGGAGGAAAG AAGGTTTCAG 2700 

GTTCCGGGGG TTGGGTCTGC GTCGTCCAGG GTGGGGCTGA TCTGAATTTC CCGCAGAACC 2760 

TCGACCAGTA GGTCTGTTGT GTTTGCTGGG AACTCGCCCG CCGTTGGGGA TACGGGGGCG 2820 

GGGGGTGTGG TTGGGCGGAC GTCCAGGGGT GCGTTATCGC ACCCCCGCGC CGCCTCGGGG 2880

GCCGTCCCGT AGATCGTTGC GGTGATGTAG ATGGTGTCCG GGGTCCACAC CACCGTCAGG 2940

ATGCCGGCCG TCGCACTCCG GACGCTTTCG CCGTGCGATG AGCTGACCCA GGAGTCAAAG 3000 

GGGTACGCGT ACATATGGGC GTCCCACCAG CGCTCCAGCC TCTGGGTACT AGCGCGTCCT 3060 

ATAAAGCGGT ATGCGCAAAA TTCGGCAGGA CAGTCGATAA TCACCAGCAG CCCGATGGGG 3120

GTGTGTTGTA TCACCACGCC TCCGCGGGGC AGGCGGTCCT GGCGCGCTCG ACCCCGCGTC 3180

AGAACCGCGC GCGTCCCTGA CTCAAACACG TGCACCACCT GTGCCGCGTC CGGCAGCGCG 3240

CTCGTTAGCG ACGCCCTGGG GTGATGTAGG CTGTACGCGA TGGTCGTCTG GGGGTTCCCC 3300

ATGTCTCGGG GGGGTGGGGG TGAATGTCAC CCGGCCCGGG TGCGGTGGGA ACGCGAGGGA 3360

ATGGAGGGTT AATAGACAAT GACCACATTC GGATCGCGTA [illegible] [illegible] 3420

CTAATGACGT CATCGCGTTC GTGGCGCTCC CGGAGCGGGT [illegible] GTGCAGGAAC 3480

TCGGATGAGG TGGTGCGGGA CATGGCTACG TACGCGCTGT [illegible] GTTTCCGGGC 3540

GTGAAGCATA TGGCGACCTT GTCCAGACTG AGCCCCTGGG [illegible] GGTCATCGCG 3600

AGTTTGGAGC TGATGCCGTA GTCGGCGTTG ATGGCCATGG [illegible] GGAGTCGATC 3660

GACTCGACAA ACTCACTGAT GTTGGTATTG ACGACAGACA [illegible] CTGGTCCCGC 3720

AGGACGATGT AGGGCAGGGG GGACTCCTCC AAGAACTCGG CCACGCCGGC CGTCGCGTGC 3780

CGCCGCCGCA GCTCCTCCGC GAACGCGAAC ACCCGGGTGT ACGTGTACCC CATCAGCGTG 3840

TAGTTGTCCG TCTGCAGGGC CACGGACATC AGCCCCCCGC GCGGCGAGCC GGTCAGCAGC 3900

TCGCAGCCCC GGAAGATGAC ATTGTCCACG TAGGTGCTGA AGGGGGCGCT CTCAAACACC 3960

TCCCCGAAGA GCTCCCGTAG GATAAGGTAT CGCCCCAGAA AGGCCCTCTT CAGGAGCCCA 4020

AACTGGGCGT GGACGGCCGC GGTGGTCTCC GGCTCTTCGA GGGCGTAGTG GCAGTAGAAC 4080

ACGTCCAGCT GCTGTTCGTC CAGCCCGGCG AAGATAACGT CAAGGTCGTC GTCGGGGAAG 4140

TCGTCCGGGC CCCCGTCCCG CGGGCCCAGG TGCTTAAAAT TGAACGCACG CTCCCCCGGA 4200

GAGCGGTCGC TGGTGTCGGC GGCCCTGGTT GCCGATGCGC CGGCGGCGTC CCGGCGTAGC 4260

GACAGGAGTT CTGCCGTCAG CTCCCCTAGG CGGCCGTAGG CCAGGGTCCT CTGGGTCGCG 4320

TCCAGGCCGG GGCGCTGGAG AAAGTTGTAA AAGTGAATCA GCCCGCCGAA CATGAGCCGC 4380

GACAGGAACC GGTAGGCGAA CTCCACCGAG GTCTCCCCCT [illegible] GAAGCTGTCG 4440

TCGCGCAGCA CAGCCTCGAA GGTCCGAAAC GTCCCGTCGA [illegible] CATCTTTCGG 4500

AGGCGCGCGG TCACCGCGAC CTGGCTGTTG AGGACGTACG [illegible] CCGGGCCACG 4560

ACTAGCTGTT GCTTGCTGTG CACCTCACAG CGCACGTGCC [illegible] GTCCTGACTC 4620

TGGGAGTAGT TGGTGATGCG ACTGGCGTTG GCCGTGATCC ACTTTTCCAT GGTCAGCGTG 4680

GGTTGCTGCG TGAGCCGTCG ATACTCGTCA AACTCTTTGA CCGACACAAA CGTAAGCACG 4740

GGGAGGGTAA ACACAACAAA CTCCCCCTCG CGAGTCACCT TTAGGTAGGC GTGGAGCTTG 4800

GCCATGTACG CGCTGACCTC CTTGTGGGAC GAGAACAGCC GCGTCCACCC CGGAAGGTTG 4860

GCCGGGTTGG TGATGTAACT TTCCGGGACG ACAAAGCGGT CCACAAACTG CATGTGCTCC 4920

TCGGTGATGG GAAGGCCGTA CTCCAGCACC TTCATGAGGT TCCCGAACTC GTGCTCCACA 4980

CATCGCTTGT TGTTAATGAA AATGGCCCAG CTGTGCGAGA GGCGCGTGTA CTCGCGTAGG 5040

GTGCGGTTGC AGATGAGGTA CGTGAGCACG TTTTCGCTCT GCCGGACGGA GCATCGCAGT 5100

TTTTGGTGTT CGAAGGTGGA CTCCAGCGAG GCCGTCTGGG TCGGCGACCC CACGCACACC 5160

AGCACCGGCC GCAGGCGGCC CGCGTACTGG GGGGTGTGGT ACAGGGCGTT AATCATCCAC 5220

CAGCAATACA CCACGGTCGT GAGTAGGTGC CGCCCCAGGA GCCCGGCCTC GTCGATGACG 5280

ATAATGTTGC TGCGGGTGAA AGCCGGCAGC GCCCCGTGTG TGACCGAGGC CAGGCGCGTG 5340

AGGGCACCCT GGCCCAGCCC CAAAGTCTGC TCTAGGGCGG TGAGGGCGTG GAACTCGTTT 5400

CGCGCGTCTT CGCCCCCGTG CGCCGCCAGG GCCCGCTTGG TGATGTCGAG GATCACCTCC 5460

CAGTAGTACG TCAGGTCTCG CCGCTGCAGG TCTTCCAGCG AGGCGGGGCT GCTGGCCAGG 5520

GTGTACGGGT GCTGCCCCAG CTGGGCCTGG ACGTGATTCC CGCGAAACCC GAACTCGTGA 5580

AAGATGGTGT TGATGGGTCG ACTCAGAAAC GCCCCCGAGA GCTTAACGTA CATGTTCTGC 5640

GCCGCGATTC GCGTGGCGCC CGTGACCACG CAGTCCAGGA CCTCGTTGAG GGTCTGCACG 5700

CACGTACTCT TTCCGGATCC GGCGTTGCCG GTGATGAGAT ACGCCGCGAA CGGAAACTCC 5760

CGGAGCGGCA GGCCGGTCGG GACCTCCAAG GCCGCCACGT CCCGGAACCA CTGCAGGCGC 5820 

GGCACCTGCG TGACGTCGAG CTGCTGCTGC GAGAGCTCTC GGATGCGTGC GATGATTGGT 5880 

TGGACCCCGT GCATGGACGT AAAATTTAAA AACGCCTCGT CCCTGAACCG CACGGCGGGT 5940 

CTGGCCCCGG GCTGCTGTGG GGGCGGACCT GGTGCCCGGA CGTCCCGCGA GCCCTCCCCG 6000

CCGGACGCCG CCATGGCCGC ACAGCGCGCG CGGGCGCCGG CGATGCGGAC GCGGGGCGGC 6060 

GACGCGGCGC TATGCGCCCC CGAGGACGGC TGGGTGAAGG TTCACCCCAC CCCCGGGACG 6120 

ATGTTGTTCC GCGAGATTCT CCTCGGGCAG ATGGGGTACA CCGAGGGTCA GGGGGTGTAC 6180 

AACGTCGTCC GGTCCAGCGA GGCCGCCACC CGACAGCTGC AGGCGGCGAT CTTCCACGCG 6240 

CTCCTCAACG CCACAACGTA CCGGGACCTG GAGGAGGACT GGCGCCGCCA CGTGGTGGCC 6300

CGCGGCCTCC AGCCGCAGCG GCTGGTTCGC AGGTACCGGA ACGCCCGGGA GGGCGATATC 6360 

GCCGGGGTGG CCGAGCGGGT GTTCGACACG TGGCGATGCA CGCTCAGGAC GACGCTGCTG 6420 

GACTTTGCCC ACGGGGTGGT AGACTGCTTT GCGCCGGGCG GCCCAAGCGG ACCGACCAGC 6480 

TTCCCCAAAT ATATCGACTG GCTGACGTGT CTGGGGCTGG TTCCCATATT GCGCAAGACG 6540 

CGCGAGGGGG AGGCGACGCA GCGCCTGGGG GCGTTTCTCA GGCAGCACAC GCTGCCCCGG 6600

CAGCTGGCCA CGGTCGCCGG GGCCGCGGAG CGCGCCGGCC CGGGGCTTCT GGAGCTGGCC 6660 

GTCGCGTTCG ACTCCACGCG CATGGCGGAA TACGACCGTG TGCACATCTA CTACAACCAT 6720 

CGCCGGGGGG AGTGGCTGGT GCGCGACCCG GTCAGCGGGC AGCGCGGCGA GTGCCTGGTG 6780 

CTGTGCCCCC CCCTGTGGAC CGGCGACCGC CTGGTCTTCG ATTCGCCCGT TCAGCGGCTG 6840 

TGCCCCGAGA TCGTCGCGTG CCACGCCCTC CGGGAACACG CGCACATCTG CCGTCTGCGC 6900

AACACCGCGT CCGTCAAGGT GCTGTTGGGG CGCAAGAGCG ACAGCGAGCG CGGGGTGGCT 6960 

GGCGCCGCGC GGGTCGTCAA TAAGGCGCTG GGGGAGGATG ACGAGACGAA GGCCGGCTCG 7020 

GCCGCCTCGC GTCTCGTGCG GCTCATCATC AACATGAAGG GCATGCGCCA CGTGGGCGAC 7080 

ATCAACGACA CGGTACGCGC CTACTTGGAC GAGGCGGGGG GGCACCTGAT CGACACCCCC 7140 

GCCGTCGACC ACACCCTCCC TGGGTTCGGC AAGGGCGGCA CCGGCCGCGG GTCGGCGGCC 7200

CAGGACCCGG GGGCGCGACC GCAGCAGCTT CGCCAGGCGT TTCAGACGGC CGTGGTCAAC 7260 

AACATCAACG GCATGCTGGA GGGCTATATC AATAATCTCT TTGGAACCAT AGAACGCCTG 7320 

CGAGAGACGA ACGCGGGTCT GGCGACCCAG CTGCAGGCGC G 7362 

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Met Arg Thr Pro Ala Asp Asp Val Ser Trp Arg Tyr Glu Ala Pro Ser
1 5 10 15

Val Ile Asp Tyr Ala Arg Ile Asp Gly Ile Phe Leu Arg Tyr His Cys
20 25 30

Pro Gly Leu Asp Thr Phe Leu Trp Asp Arg His Ala Gln Arg Ala Tyr
35 40 45

Leu Val Asn Pro Phe Leu Phe Ala Ala Gly Phe Leu Glu Asp Leu Ser
50 55 60

His Ser Val Phe Pro Ala Asp Thr Gln Glu Thr Thr Thr Arg Arg Ala
65 70 75 80

Leu Tyr Lys Glu Ile Arg Asp Ala Leu Gly Ser Arg Lys Gln Ala Val
85 90 95

Ser His Ala Pro Val Arg Ala Gly Cys Val Asn Phe Asp Tyr Ser Arg
100 105 110

Thr Arg Arg Cys Val Gly Arg Arg Asp Leu Arg Pro Ala Asn Thr Thr
115 120 125

Ser Thr Trp Glu Pro Pro Val Ser Ser Asp Asp Glu Ala Ser Ser Gln
130 135 140

Ser Lys Pro Leu Ala Thr Gln Pro Pro Val Leu Ala Leu Ser Asn Ala
145 150 155 160

Pro Pro Arg Arg Val Ser Pro Thr Arg Gly Arg Arg Arg His Thr Arg
165 170 175

Leu Arg Arg Asn
180



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 334 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

Met Lys Arg Ala Arg Ser Arg Ser Pro Ser Pro Pro Ser Arg Pro Ser
1 5 10 15

Ser Pro Phe Arg Thr Pro Pro His Gly Gly Ser Pro Arg Arg Glu Val
20 25 30

Gly Ala Gly Ile Leu Ala Ser Asp Ala Thr Ser His Val Cys Ile Ala
35 40 45

Ser His Pro Gly Ser Gly Ala Gly Tyr Pro Thr Arg Leu Ala Ala Gly
50 55 60

Ser Ala Val Gln Arg Arg Arg Pro Arg Gly Cys Pro Pro Gly Val Met
65 70 75 80

Phe Ser Ala Ser Thr Thr Pro Glu Gln Pro Leu Gly Leu Ser Gly Asp
85 90 95

Ala Thr Pro Pro Leu Pro Thr Ser Val Pro Leu Asp Trp Ala Ala Phe
100 105 110

Arg Arg Ala Phe Leu Ile Asp Asp Ala Trp Arg Pro Leu Leu Glu Pro
115 120 125

Glu Leu Ala Asn Pro Leu Thr Ala Arg Leu Leu Ala Glu Tyr Asp Arg
130 135 140

Arg Cys Gln Thr Glu Glu Val Leu Pro Pro Arg Glu Asp Val Phe Ser
145 150 155 160

Trp Thr Arg Tyr Cys Thr Pro Asp Asp Val Arg Val Val Ile Ile Gly
165 170 175

Gln Asp Pro Tyr His His Pro Gly Gln Ala His Gly Leu Ala Phe Ser
180 185 190

Val Arg Ala Asp Val Pro Val Pro Pro Ser Leu Arg Asn Val Leu Ala
195 200 205

Ala Val Lys Asn Cys Tyr Pro Asp Ala Arg Met Ser Gly Arg Gly Cys
210 215 220

Leu Glu Lys Trp Ala Arg Asp Gly Val Leu Leu Leu Asn Thr Thr Leu
225 230 235 240

Thr Val Lys Arg Gly Ala Ala Ala Ser His Ser Lys Leu Gly Trp Asp
245 250 255

Arg Phe Val Gly Gly Val Val Arg Arg Leu Ala Ala Arg Arg Pro Gly
260 265 270

Leu Val Phe Met Leu Trp Gly Ala His Ala Gln Asn Ala Ile Arg Pro
275 280 285

Asp Pro Arg Gln His Tyr Val Leu Lys Phe Ser His Pro Ser Pro Leu
290 295 300

Ser Lys Val Pro Phe Gly Thr Cys Gln His Phe Leu Ala Ala Asn Arg
305 310 315 320

Tyr Leu Glu Thr Arg Asp Ile Met Pro Ile Asp Trp Ser Val
325 330

(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 231 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Met Val Lys Ser Arg Val Ser Tyr Arg Ser Val Met Ser Gly Val Gly
1 5 10 15

Glu Glu Arg Val Pro Ser Ala Phe Thr Ile Leu Ala Ser Trp Gly Trp
20 25 30

Thr Phe Ala Pro Gln Asn His Asp Pro Gly Asp Asn Thr Thr Pro Ile
35 40 45

Glu Ser Ile Ala Gly Thr Ala Pro Asp Ala His Val Gly Pro Leu Asp
50 55 60

Gly Glu Pro Asp Arg Asp Ala Ile Ser Pro Leu Thr Ser Ser Val Ala
65 70 75 80

Gly Asp Pro Pro Gly Ala Asp Gly Pro Tyr Val Thr Phe Asp Thr Leu
85 90 95

Phe Met Val Ser Ser Ile Asp Glu Leu Gly Arg Arg Gln Leu Thr Asp
100 105 110

Thr Ile Arg Lys Asp Leu Arg Leu Ser Leu Ala Lys Phe Ser Ile Ala
115 120 125

Cys Thr Lys Thr Ser Ser Phe Ser Gly Thr Ala Ala Arg Gln Arg Lys
130 135 140

Arg Gly Ala Pro Pro Gln Arg Thr Cys Val Pro Arg Ser Asn Lys Ser
145 150 155 160

Leu Gln Met Phe Val Leu Cys Lys Arg Ala Asn Ala Ala Gln Val Arg
165 170 175

Glu Gln Leu Arg Ala Val Ile Arg Ser Arg Lys Pro Arg Lys Tyr Tyr
180 185 190

Thr Arg Ser Ser Asp Gly Arg Leu Cys Pro Ala Val Pro Val Phe Val
195 200 205

His Glu Phe Val Ser Ser Glu Pro Met Arg Leu His Arg Asp Asn Val
210 215 220

Met Leu Ser Thr Glu Pro Asp
225 230

(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

Met Gly Asn Pro Gln Thr Thr Ile Ala Tyr Ser Leu His His Pro Arg
1 5 10 15

Ala Ser Leu Thr Ser Ala Leu Pro Asp Ala Ala Gln Val Val His Val
20 25 30

Phe Glu Ser Gly Thr Arg Ala Val Leu Thr Arg Gly Arg Ala Arg Gln
35 40 45

Asp Arg Leu Pro Arg Gly Gly Val Val Ile Gln His Thr Pro Ile Gly
50 55 60

Leu Leu Val Ile Ile Asp Cys Arg Ala Glu Phe Cys Ala Tyr Arg Phe
65 70 75 80

Ile Gly Arg Ala Ser Thr Gln Arg Leu Glu Arg Trp Trp Asp Ala His
85 90 95

Met Tyr Ala Tyr Pro Phe Asp Ser Trp Val Ser Ser Ser His Gly Glu
100 105 110

Ser Val Arg Ser Ala Thr Ala Gly Ile Leu Thr Val Val Trp Thr Pro
115 120 125

Asp Thr Ile Tyr Ile Thr Ala Thr Ile Tyr Gly Thr Ala Pro Glu Ala
130 135 140

Arg Cys Asp Asn Ala Pro Leu Asp Val Arg Pro Thr Thr Pro Pro Ala
145 150 155 160

Pro Val Ser Pro Thr Ala Gly Glu Phe Pro Ala Asn Thr Thr Asp Leu
165 170 175

Leu Val Glu Val Leu Arg Glu Ile Gln Ile Ser Pro Thr Leu Asp Asp
180 185 190

Ala Asp Pro Thr Pro Gly Thr



30 
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(2) INFORMATION FOR SEQ ID 



NO: 117: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 877 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



40 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO; 117: 
Met Ala Ala Ser Gly Gly Glu Gly Ser Arg Asp Val Arg Ala Pro Gly
1 5 10 15

Pro Pro Pro Gln Gln Pro Gly Ala Arg Pro Ala Val Arg Phe Arg Asp
20 25 30

Glu Ala Phe Leu Asn Phe Thr Ser Met His Gly Val Gln Pro Ile Ile
35 40 45

Ala Arg Ile Arg Glu Leu Ser Gln Gln Gln Leu Asp Val Thr Gln Val
50 55 60

Pro Arg Leu Gln Trp Phe Arg Asp Val Ala Ala Leu Glu Val Pro Thr
65 70 75 80

Gly Leu Pro Leu Arg Glu Phe Pro Phe Ala Ala Tyr Leu Ile Thr Gly
85 90 95

Asn Ala Gly Ser Gly Lys Ser Thr Cys Val Gln Thr Leu Asn Glu Val
100 105 110

Leu Asp Cys Val Val Thr Gly Ala Thr Arg Ile Ala Ala Gln Asn Met
115 120 125

Tyr Val Lys Leu Ser Gly Ala Phe Leu Ser Arg Pro Ile Asn Thr Ile
130 135 140

Phe His Glu Phe Gly Phe Arg Gly Asn His Val Gln Ala Gln Leu Gly
145 150 155 160

Gln His Pro Tyr Thr Leu Ala Ser Ser Pro Ala Ser Leu Glu Asp Leu
165 170 175

Gln Arg Arg Asp Leu Thr Tyr Tyr Trp Glu Val Ile Leu Asp Ile Thr
180 185 190

Lys Arg Ala Ala His Gly Gly Glu Asp Ala Arg Asn Glu Phe His Ala
195 200 205

Leu Thr Ala Leu Glu Gln Thr Leu Gly Leu Gly Gln Gly Ala Leu Thr
210 215 220

Arg Leu Ala Ser Val Thr His Gly Ala Leu Pro Ala Phe Thr Arg Ser
225 230 235 240

Asn Ile Ile Val Ile Asp Glu Ala Gly Leu Leu Gly Arg His Leu Leu
245 250 255

Thr Thr Val Val Tyr Cys Trp Trp Met Ile Asn Ala Leu Tyr His Thr
260 265 270

Pro Gln Tyr Ala Gly Arg Leu Arg Pro Val Leu Val Cys Val Gly Ser
275 280 285

Pro Thr Gln Thr Ala Ser Leu Glu Ser Thr Phe Glu His Gln Lys Leu
290 295 300

Arg Cys Ser Val Arg Gln Ser Glu Asn Val Leu Thr Tyr Leu Ile Cys
305 310 315 320

Asn Arg Thr Leu Arg Glu Tyr Thr Arg Leu Ser His Ser Trp Ala Ile
325 330 335

Phe Ile Asn Asn Lys Arg Cys Val Glu His Glu Phe Gly Asn Leu Met
340 345 350

Lys Val Leu Glu Tyr Gly Leu Pro Ile Thr Glu Glu His Met Gln Phe
355 360 365

Val Asp Arg Phe Val Val Pro Glu Ser Tyr Ile Thr Asn Pro Ala Asn
370 375 380

Leu Pro Gly Trp Thr Arg Leu Phe Ser Ser His Lys Glu Val Ser Ala
385 390 395 400

Tyr Met Ala Lys Leu His Ala Tyr Leu Lys Val Thr Arg Glu Gly Glu
405 410 415

Phe Val Val Phe Thr Leu Pro Val Leu Thr Phe Val Ser Val Lys Glu
420 425 430

Phe Asp Glu Tyr Arg Arg Leu Thr Gln Gln Pro Thr Leu Thr Met Glu
435 440 445

Lys Trp Ile Thr Ala Asn Ala Ser Arg Ile Thr Asn Tyr Ser Gln Ser
450 455 460

Gln Asp Gln Asp Ala Gly His Val Arg Cys Glu Val His Ser Lys Gln
465 470 475 480

Gln Leu Val Val Ala Arg Asn Asp Ile Thr Tyr Val Leu Asn Ser Gln
485 490 495

Val Ala Val Thr Ala Arg Leu Arg Lys Met Val Phe Gly Phe Asp Gly
500 505 510

Thr Phe Arg Thr Phe Glu Ala Val Leu Arg Asp Asp Ser Phe Val Lys
515 520 525

Thr Gln Gly Glu Thr Ser Val Glu Phe Ala Tyr Arg Phe Leu Ser Arg
530 535 540

Leu Met Phe Gly Gly Leu Ile His Phe Tyr Asn Phe Leu Gln Arg Pro
545 550 555 560

Gly Leu Asp Ala Thr Gln Arg Thr Leu Ala Tyr Gly Arg Leu Gly Glu
565 570 575

Leu Thr Ala Glu Leu Leu Ser Leu Arg Arg Asp Ala Ala Gly Ala Ser
580 585 590

Ala Thr Arg Ala Ala Asp Thr Ser Asp Arg Ser Pro Gly Glu Arg Ala
595 600 605

Phe Asn Phe Lys His Leu Gly Pro Arg Asp Gly Gly Pro Asp Asp Phe
610 615 620

Pro Asp Asp Asp Leu Asp Val Ile Phe Ala Gly Leu Asp Glu Gln Gln
625 630 635 640

Leu Asp Val Phe Tyr Cys His Tyr Ala Leu Glu Glu Pro Glu Thr Thr
645 650 655

Ala Ala Val His Ala Gln Phe Gly Leu Leu Lys Arg Ala Phe Leu Gly
660 665 670

Arg Tyr Leu Ile Leu Arg Glu Leu Phe Gly Glu Val Phe Glu Ser Ala
675 680 685
Pro Phe Ser Thr Tyr Val Asp Asn Val Ile Phe Arg Gly Cys Glu Leu
690 695 700

Leu Thr Gly Ser Pro Arg Gly Gly Leu Met Ser Val Gln Thr Asp Asn
705 710 715 720

Tyr Thr Leu Met Gly Tyr Thr Tyr Thr Arg Val Phe Ala Phe Ala Glu
725 730 735

Glu Leu Arg Arg Arg His Ala Thr Ala Gly Val Ala Glu Phe Leu Glu
740 745 750

Glu Ser Pro Leu Pro Tyr Ile Val Leu Arg Asp Gln His Gly Phe Met
755 760 765

Ser Val Val Asn Thr Asn Ile Ser Glu Phe Val Glu Ser Ile Asp Ser
770 775 780

Thr Glu Leu Ala Met Ala Ile Asn Ala Asp Tyr Gly Ile Ser Ser Lys
785 790 795 800

Leu Ala Met Thr Ile Thr Arg Ser Gln Gly Leu Ser Leu Asp Lys Val
805 810 815

Ala Ile Cys Phe Thr Pro Gly Asn Leu Arg Leu Asn Ser Ala Tyr Val
820 825 830

Ala Met Ser Arg Thr Thr Ser Ser Glu Phe Leu His Met Asn Leu Asn
835 840 845

Pro Leu Arg Glu Arg His Glu Arg Asp Asp Val Ile Ser Glu His Ile
850 855 860

Leu Ser Ala Leu Arg Asp Pro Asn Val Val Ile Val Tyr
865 870 875



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 588 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:



Met Val Leu Met Gly Arg Leu Arg Asn Ala Pro Glu Ser Leu Thr Tyr
1 5 10 15

Met Phe Cys Ala Ala Ile Arg Val Ala Pro Val Thr Thr Gln Ser Arg
20 25 30

Thr Ser Leu Arg Val Cys Thr His Val Leu Phe Pro Asp Pro Ala Leu
35 40 45

Pro Val Met Arg Tyr Ala Ala Asn Gly Asn Ser Arg Ser Gly Arg Pro
50 55 60

Val Gly Thr Ser Lys Ala Ala Thr Ser Arg Asn His Cys Arg Arg Gly
65 70 75 80

Thr Cys Val Thr Ser Ser Cys Cys Cys Glu Ser Ser Arg Met Arg Ala
85 90 95

Met Ile Gly Trp Thr Pro Cys Met Asp Val Lys Phe Lys Asn Ala Ser
100 105 110

Ser Leu Asn Arg Thr Ala Gly Leu Ala Pro Gly Cys Cys Gly Gly Gly
115 120 125

Pro Gly Ala Arg Thr Ser Arg Glu Pro Ser Pro Pro Asp Ala Ala Met
130 135 140

Ala Ala Gln Arg Ala Arg Ala Pro Ala Met Arg Thr Arg Gly Gly Asp
145 150 155 160

Ala Ala Leu Cys Ala Pro Glu Asp Gly Trp Val Lys Val His Pro Thr
165 170 175

Pro Gly Thr Met Leu Phe Arg Glu Ile Leu Leu Gly Gln Met Gly Tyr
180 185 190

Thr Glu Gly Gln Gly Val Tyr Asn Val Val Arg Ser Ser Glu Ala Ala
195 200 205

Thr Arg Gln Leu Gln Ala Ala Ile Phe His Ala Leu Leu Asn Ala Thr
210 215 220

Tyr Asp Leu Glu Glu Asp Trp Arg Arg His Val Val Arg Leu Gln Pro
225 230 235 240

Gln Arg Leu Val Arg Arg Tyr Arg Asn Ala Arg Glu Gly Asp Ile Ala
245 250 255

Gly Val Ala Glu Arg Val Phe Asp Thr Trp Arg Cys Thr Leu Arg Thr
260 265 270

Thr Leu Leu Asp Phe Ala His Gly Val Val Asp Cys Phe Ala Pro Gly
275 280 285

Gly Pro Ser Gly Pro Thr Ser Phe Pro Lys Tyr Ile Asp Trp Leu Thr
290 295 300

Cys Leu Gly Leu Val Pro Ile Leu Arg Lys Thr Arg Glu Gly Glu Ala
305 310 315 320

Thr Gln Arg Leu Gly Ala Phe Leu Arg Gln His Thr Leu Pro Arg Gln
325 330 335

Leu Ala Thr Val Ala Gly Ala Ala Glu Arg Ala Gly Pro Gly Leu Leu
340 345 350

Glu Leu Ala Val Ala Phe Asp Ser Thr Arg Met Ala Glu Tyr Asp Arg
355 360 365

Val His Ile Tyr Tyr Asn His Arg Arg Gly Glu Trp Leu Val Arg Asp
370 375 380

Pro Val Ser Gly Gln Arg Gly Glu Cys Leu Val Leu Cys Pro Pro Leu
385 390 395 400

Trp Thr Gly Asp Arg Leu Val Phe Asp Ser Pro Val Gln Arg Leu Cys
405 410 415

Pro Glu Ile Val Ala Cys His Ala Leu Arg Glu His Ala His Ile Cys
420 425 430

Arg Leu Arg Asn Thr Ala Ser Val Lys Val Leu Leu Gly Arg Lys Ser
435 440 445

Asp Ser Gly Val Ala Gly Ala Ala Arg Val Val Asn Lys Ala Leu Gly
450 455 460

Glu Asp Asp Glu Thr Lys Ala Gly Ser Ala Ala Ser Arg Leu Val Arg
465 470 475 480

Leu Ile Ile Asn Met Lys Gly Met Arg His Val Gly Asp Ile Asn Asp
485 490 495

Thr Val Arg Ala Tyr Leu Asp Glu Ala Gly Gly His Leu Ile Asp Thr
500 505 510

Pro Ala Val Asp His Thr Leu Pro Gly Phe Gly Lys Gly Gly Thr Gly
515 520 525

Arg Gly Ser Ala Ala Gln Asp Pro Gly Ala Arg Pro Gln Gln Leu Arg
530 535 540

Gln Ala Phe Gln Thr Ala Val Val Asn Asn Ile Asn Gly Met Leu Glu
545 550 555 560

Gly Tyr Ile Asn Asn Leu Phe Gly Thr Ile Glu Arg Leu Arg Glu Thr
565 570 575

Asn Ala Gly Leu Ala Thr Gln Leu Gln Ala Arg Val
580 585

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21035 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 



GTCTGGCCGC CGGCCCTGGC GTACGCGCTA TATAAGCCCA TGCGGTATTG GATGAGTTCC 60 

CGCGCGCCCC GGAACTCCTC CACCGCCCAC GGGGCCAGGT CCGCGGCCGC CGCGTCGAAC 120 

TCCGCCAGCA GGCGCCCCAG GGCGTCAAAG TTCATCTCCC AGGGCACCCT GCGCACCACC 180

TCATCCCGCA GCCGGGCGCA CAGGGCGGTG TGCTTGGTGA CGCGCGCGCC CAGCTCCTCC 240 

ACGGCCTCCG CGCGCTCGGC GCCCTTGGCG CCCAGGACGC CCTGGTACCT GGCGGAAAGG 300 

CGCTCGTAGG CCGGCTGGGC CCGCAGCCCC GACACCGTGT TGGTGGTGTC CTGCAGGGCG 360 
CGCAGCTGCT CGTGCATGGC GCGGAACCCC TCGGGGGACT TCCAGGCGCC CCCCCGGACG 420

CGGCCAAAGC GACCCCAGAC CTCGTCCCAC TCCGCCTCGG CCTCCTCCAG GGACCTCCGC 480

AGGGCGTCGA CGCGGCGCCG AGTATCAAAG AGCGCCCCCA GGCGGCCGGC GTGCCGCGCC 540

AGGGGGCCGG GGCCGTCGCC GCGGGCGGCG CTTAGCGGGT GCGTCTCGAA GGTGCGCTGG 600

GCGTGCTCTA GCCAGATAAC CGCGGGCACG TCGAGCTCGC GCGTTTTCTC GGTCTGATCC 660

AACAGAACCT CGACCTGGTC GGCGATCTCC GCCACCGAGC GCGCCTGGTC GAGCGTCTTG 720 

GCCACGGTCG CCGGGACGGC GACCACCTTC AGCATGGTCT TGAGGTTGGC CAGGCCCTCG 780 

GCCTCGATCT GGGCCCGGCG CTCGCGCGCG GCCAGCGCCT CCCGCAGGCC CGCCATGACC 840 

CGCTCGGTGG CCTCCGCGCG CTGCTGTTTG GCGCGCACCA CTGCGTCCTT GGTCTCGGCC 900 

GTGTCCTGCC GGGTCACGAA GGCGACATAC TCGGCGTACG CCGTGTTCTT CACGGGGCTC 960

TGGTCCACGC GCTCCAACGC CGCCGCGCAC GCGACCAGCG CGTCCTCGCT GGGACACGGC 1020 

AGGGTGACCC CGGTCCGGAC CAGCTCCGCG GTGGCCTCCG GGTCATTCCG GGCCGCGGAT 1080 

ATCTGCTCCG CGGCGGCCGC CAGGTCCAGG GGCACGCCGC CGAGCGCCCG GTGCACGTCG 1140 

GCCCGGATGG CGTCCAGGCG ATCGCGGAGC TCCACGTAGT GGGCGTAGCC ATGTTGGAAG 1200 

AACGGCACGT ACCGGCGCAG GCCGGGCACG CTCGTCATGT CGTCCGCCAG GCGCCCCACG 1260

GCCTCGTGGT AGTCGATAAA CCCGTCGCCC GCCTGGGCCA TTTCCAGGAG CCCCTCCGCG 1320 

ATGCGCAGCA GCCGCGCCAG GGGCTCGGCG TCGACCCGAA ACATGTCGGC GTAGGTTTCG 1380 

GCGGCGGCGT GGAACGCCGC GCTCCAGCCG AGGCGGTGGA TGGCGGCGAG CGGGGGGAGC 1440 

ATGGGGTGGC GCTGGTTCTC GGGGGTGTAG GGGTTAAACG CGAAGGCCGT ATCCAGGGCG 1500 

AGGGTGACCG CCTCGGCGTT GGCCGCGAGC GCCTGCTCGG CGCGCTTGCG GAAGTCCCGG 1560

GGGTTGTAGC CGTGCGTGCC CGCCAGCGCC TGCAGGCGGC GCAGCTCGAC CACGTCGAAC 1620 

TCGGCGCGGT TCTCGACGCG GTCCAGCGCC GCCTCGACGC CGGCGGCCCA GCGCTCGCTG 1680 

CTGCCCCGGG CGCGCTGGGC CGCCATCTTC GCCGTCAGGT CGGCGACGGC GGCCTCAAGT 1740 

TCGTCGGCGC GGCGTCGCGT GGCGCCGATG ACCTTGCCCA GCTCCTGCAG GGCGCGCCCG 1800 

CTGGGGGAAT GGTCCCCGGC CGTCCCTTCG GCGTGCAGCA GGCCCCCGAA CCCAGCCTCG 1860

TGCCCCGCGA GGCTTTCCCG AGCAGCGGTC GTCGCGCGGG CCGCGGCATC GATGAGGGCG 1920 

GCATGGTCCC CCTCCGGCTG GGCGCAGGCC CGGCGCGCCT GGACTACCAG GTCGGCGGCC 1980 

GCCGACCCCA GGGTCGTGAG CTCGTCGATG GCCCCCCGCG CCTCCAGGGC CAGCCGAGTC 2040 

GCCTTTACAT ACCCCGCGGC GCTATCGGCC AGCGCCGCGA GGAAGGACAG GGGCGAGGCC 2100 

GGGTCGCGGG CGGCCGCGCC CAGGGCCGAC ACCGCGTCCG CCAGGGCGCC ATGCGCCCGC 2160

ACGGCCGCGT CCACCGTCGC CGCGGGACTT GCCGTCGCGA CGGCGGCGCT CCCGGCGTTG 2220 

ATGGCGTTTG ACACGGCTTT GGCGATTGTG GGGGCGTGAT CGGAAAAGAA CTGCACGAGG 2280 

ACCGGCGTCT CGGGGGCGTC GGCGAACAGG GTCTTCAGCA CCACCACGAA GGCGGGATGC 2340 

AGGCCGGCCA GAGCCGTCGC GGTATCCGGG GTCGGGTGTT CCAGGGCCTC CCGGTACTGC 2400 

CCCAGCAGCC CCCACAGGTC CGCCCGCAGC GCCGCCGTGA CTTCCGGGGG GGGGCCCCGG 2460

ACGGCATCGG CCAGGTCGGT CCACCCCGCG GGCAGGGAGG CCCGCAGGGT CGCCAGCACG 2520 

GCCGGACACG CCTTTAGCCC CACAAAGTCC GGGAGGGGCC GCAGGACCCC TTGGAGTTTG 2580 

TGCAGGAACT TCTCCCGGGC GTCGTGGGCC ACCTTGGCGC GCTCCCGCGC GTCGTTGAGC 2640 

ATCGCCTCCA GGGCGTGGGC GCGCTCCCGA AGCCGGGAGC GCGCCTCCGG AGCGAGCTCC 2700 

GCCGTCATCT TGGCCGCCTC CATGGCCCTC GCCTGCCGCA GCGCGTCTTC GGCCATGCGC 2760

GTGGCCTCGG GGGACAGCCC GCCCCCGTCG ACGTACGGCG CGGGGCCGGT CGCCGGGACG 2820 

AAGGCCGCGT CGCTGTCCAG CTGCTGCGCG AGCGCCGCGT CGAGGGCGTC GAAGCGCTGC 2880 

AGTTCGGCCA GCCCCGAGCT GCGCCGCGCC TGCTGGTCGT TGATGCCGTG GATGCTGCGC 2940 
AGGGACGAGC GCATGCACGA TACCGACCCC CCCGGCTCCA GATCGGTCGC GAACTGGTTC 5580

CGAACACCGG TGACCACGAT ATCGCGATCC CCCTGGCGCT TCATCGTGGG GTGAGGTAGC 5640 

GCGGCCGGAA TCATGTGTGC CGCGCCCGCC ACGAGCGGGG CCTGTTTATG GGCCGGGCGT 5700 

CCCGATGAGT ACTGTTGTTT CCGCCGCCCG AACCCCCCCC GCCCATCAAC CGCCTGTTCG 5760 

TCCCCCTAAC CACACACCCG GTATCGCGTG TGTGGTTTCC CGGGAAGCCA CATCCCACCC 5820

CATGAAGTTT TGCCCTTTTT TTCCGTCCCG CACTACGCCA CCTTTCCACC CCCCCCCCCC 5880 

AAAAAAAAAA AAACAACAAC CAACTCCCAG ATGGATGGGT GCGATAATAA AGCTTTATTA 5940 

TTGTTTAACC AAAGGCGAGT CCTACGGGTG TACCGGTGGT GTCTCCTGCG GCGTCATCTC 6000 

GTCGTCCTCC ACGGGGGTGT TGGGCCAAGG GACCGTCTCG CGGCCCGCCG GGCGCGTCGA 6060 

CGGCGCGCGG GCCTGCGTGT CCTGTGGGCC GGGTGTCGTG GGTTCGGGGG TGCTACCGCC 6120

GGCATCTTGG GCCTCCAGGT CCCCGGGGGC CTCCGGGCGG GCGGAAGGCC GAAACGCCGA 6180 

GGCGCGAAAC ACGCCGTCGG TGACCTGCAG GAGCTCGTTT ATTAATAGCC AGTCCATGCT 6240 

CAGCGTAGCG GCCAGCCCCT GGGGAGACAG GTCCACGGAG TCCGGAACCA CCGTCGGCTG 6300 

ACCCAGGGGC CCCAGGCTGT AGTCCCCCCA GGCCCCCAGG TCATGACGGT TCGTGAGCAC 6360 

GACGAGGTCT GCGGCCGGGC TGGGGGGCGC GTCCTCGGTC GCGTGGGCCA TCACCTCCTG 6420

AATGGCTGCG GTGCGCTGAT CGGCCGAGCT GGCGAAGGGC GCCACGACCA GCGCGCGCTC 6480 

CGTCTGCAGG CCCTTCCACG TGTCGTGGAG TTCCTGAACG AACTCGGCCA CCCGCTCGGG 6540 

GCCCGTGGCC GCGCGCGCGG CCTGATAGCC GGCCGAGAGG CGCCGCCAGC GCGCCAGGAA 6600 

CTGACTCATG TAACAGAACC CGGGGACCTG GTCCCCCGAC ATCAACTTTG ACGCCCTGGC 6660 

GTGGATGCCC GACACGATGG CCAGGAACCC GTGGATTTCC CGCCGCACGA CGGCCAGCAC 6720

GTTACCCTCG TGCGAGACCT GGGCCGCCAG CTCGTCGCAT ACCCCGAGGT GCGCCGTCGT 6780 

CTCGGTGACG ACGGACCGCA GCCCCGCGAG GGACGCGACC AGCGCGCGCT TGGCGTCGTG 6840 

ATACATGCCG CAGTACTGGC TCACCGCGTC GCCCATGGCC TCGGGGCGCC AGGGCCCCAG 6900 

GCGCTCGTGG GCGTCTGCGA CCACGGCGTA CAGGCGGTGC CCGTCGCTCT CGAACCGGCA 6960 

CTCAAAGAAG GCGGCGAGCG TGCGCATGTG CAGCCGCAGC AGCACGATCG CGTCCTCCAG 7020

CTGGCGGACC AGGGGGTCGG CGCGCTCGGC GAGCTCCTGC AGCACCCCCC GGGCCGCCAG 7080

GGCGTACATG CTGATCAGCA GCAGGCTGCT GCCCACCTCG GGAGGCTGGG GGGGAGGCAG 7140 

CTGGACCGCG GGCCGCAGCT GCTCGACGGC CCCCCTGGCG ATCACGTACA GCTCGCGCAG 7200 

CAGCTGCTCG ATGTTGTCGG CCATCTGCAT CGTGGGCCCG ACGCCGGCCC GGGTGGCCGG 7260 

TTCGAGGAGG GTGATCAGCG CGCCCAATTT TGTGCGGTGC CCCTCGACGG TGGGGAGATA 7320

GCCCAGGCCG AAGTCGCGCG CCCAGGCCAG CACCCGCAGG GCAAACTCGA TGGGGCGGGG 7380 

CAGGTAGGCA GCGTTGCACG TGGCCCTCAG CGCGTCCCCG ACCACCAGGG CCAGCACGTA 7440 

AGGGACGAAC CCCGGGTCGG CGAGGACGTT GGGGTGGATG CCCTCCAGGG CCGGGAAGCG 7500 

GATCTTGGTG GCCGCGGCCA GGTGAACCGA GGGGGCGTGG CTAGGCGGCC CGACGGGGAG 7560

CAGCGCGGAC AGCGGCGTGG CCGGGGTGGT GGGGGTCAGG TCCCAGTGGG TCTGGCCGTA 7620

CACGTCGAGC CAGATGAGCG CCGTCTCGCG CAGGAGGCTG GGCTGGCCGG CGCTGAAGCG 7680 

GCGCTCGGCC GTCTCAAACT CCCCCACGAG CGTGCGCCGC AGGCTCGCCA GGTGTTCCGT 7740 

CGGCACGGCC GGGCCCATGA TGCGCGCCAG CGTCTGGCTG AGGACGCCGC CCGACAGGCC 7800 

GACCGCCTCA CAGAGCCGCC CGTGCGTGTG CTCGCTGGCG CCCTGGATCC GCCGGAACGT 7860 

TTTCACGTAG CCGGCGTAGT GCCCGTACTC CCGCGCGAGC CCGAACACGT TCGCCCCCGC 7920

AAGGGCAATG CACCCAAAGA GCTGCTGGAT CTCGCTGAGC CCGTGGCCGG GGGGCGTCCG 7980 

CGCGGGCACC CCCGCCACCA AAAACCCCTC CAGGGCCGAT ATGTACTGGG TGCAGTGCGC 8040 

GGGCGTGAAC CCCGCGTCGG TAAGCGTGTT GATCACCACG GAGGGCGAGT TGCTGTTCTG 8100 
GACCAAAGCC CACGTCTGCT GCAGCAGCGC GAGGAGCCGT TGCTGGGCCC CGGCGGAGGG 8160

CGGCTCCCCT AGCTGCAGCA GGCCGGTGAC GGCCGGACGG AAGATGGCCA GCGCCGACGC 8220

ACTCAGAAAC GGCACGTCGG GGTCGAAGAC GGCCGCGTCC GTCCGCACGC GCGCCATCAG 8280

CGTCCCCGGG GGCGCGCACG CCGACCGCGG GCTGACGCGG CTTAGGGCGG TCGACACGCG 8340

CACCTCCTCG CGACTGCGAA CCATTTTGGT GGCCTCGAGG GGCGGGATCA TGATAGCCGG 8400

GTCGATCTCC CGCACCGTGT GCTGAAACTG GGCCAGCAGC GGCGGCGGGA CCACCGCGCC 8460

CCGATCGGGG GTCGTCAGGT ACTCGTCCAC CAGCGCCAGC GTAAACAGGG CCCGCGTGAG 8520

GGGGGTCAGG GCGGCGTCGT CGATGCGCTG TAGGTGCGCC GAGAACAGCG TCACCCAATT 8580

GCTGACCAGG GCCAAGAACC GGAGACCCTC TTGCACGATC GGGGACGGGA AGAGCAGGCT 8640

GTACGCCGGG GTGGTCAGGT TGGCGCCGGG TTGCCCCAGG GGAACCGGGG ACATCTTAAG 8700

CGACATCTCC CCGAGGGCCT CCAGGGAGGT CCGCGGGTTC ATGGCCAGGC AGCTCTGGGT 8760

GACGGTCCGC CAGCGGTCGA TCCACTCCAC GGCACACTGG CGGACGCGCA CCGGCCCCAG 8820

GGCCGCCGTG GTGCGCAGCC CGGCGGCCTC CAGCGCGTGG GTCGTGTCGG AGCCGGTGAT 8880

CGCCAGGACC GTGTCCTTGA TGACGTCCAT CTCCCGGAAG GCCGCCTCGG GGGTCTCGGG 8940

GAGCGCCACC GCCATGCGGT GCACCAGCAG CCCGGGGAGG TTCTCGGCCA AGAGCGCCGT 9000

CTCCGGAAGC CCGTGGGCCC GGTGCAAGGC GCACAGTTGC TCCAGGAGCG GGTGCCAGCA 9060

CGCCCGCGCC TCCGCCGGGC CGACCGCCGC GCCCGACAAC AGAAACGCCG CCGTGGCGGC 9120

GCGCAGTTTG GCCGCGGACA GAAACGCCGG CTCGTCCGCG CTGCCCGCCG GCTCGCTCGA 9180

GGGGGAGGGC GGCCGGCGGA GGTTGGTCAG GCTCCCCAAC AGGACCTGCA ACGGTCCGTT 9240

TGGGGGTGGA GCGGACGGGG GGGTCATGCC GGCGGGCGCC GGGACCTGGA GCGCGCTGTC 9300

CGACATGGCG ACCGGCGTGC GCGCTCGGCG ACGCGGCGCG GAGACCGCGG GCCCAAACGG 9360

GAATGACTGC CGCCGCCCTA TACGGAGGGG CTAAGTATCG CCCGGGGACC CTTCGAAACC 9420

CCGGGCGTGT CGCAAGTACG CCGCGAAGGC GCGGCGTGTT ATACGGCGCG TTATGTCCCG 9480

GCATTCCGTT CGTGGGTTCG GGCCCGGGTG CTGTCGGGTG GGAGTGTGTG TGTGTGGGGG 9540

GGGGGCGGCG CGACGGCGGC CCGGACCAAG TGTATCGCGG CCGTTCCGTG GGGCGGCCCA 9600

ACAGGCCCTT TAAACATTTG CGTATGCACC GGCCCAGCCA GTCGGACACC GGAACCCACC 9660

AGAGGCGGAA GCCGCCTTCG CCCGTGAGGG TGCGTGTGTT TTCTGGTGGC GTGTTTTTCC 9720

TTTCCGCCCT CCTCCCTCCC CACCTCCACC ACCCCCCCCC CACAACTCGC CCGTTGGCGA 9780

TCGGCGGGAA AACCATGAAA ACCAAGCCAC TCCCGACAGC CCCGATGGCG TGGGCCGAGA 9840

GTGCCGTGGA AACCACCACC AGCCCGCGCG AGCTCGCGGG CCACGCCCCG CTCCGGCGCG 9900

TCCTGCGCCC GCCCATCGCT CGCCGCGACG GCCCGGTGCT TTTGGGGGAC AGGGCCCCCA 9960

GGAGGACGGC CAGTACGATG TGGCTGCTGG GGATCGACCC CGCGGAGTCG TCTCCGGGAA 10020

CGCGCGCTAC CCGAGACGAT ACCGAGCAGG CCGTGGACAA GATCCTCAGG GGAGCCCGGC 10080

GCGCGGGAGG GCTGACCGTC CCCGGCGCCC CCCGCTATCA CCTGACCCGC CAGGTAACCC 10140

TGACGGATCT CTGCCAACCA AACGCGGAGC GGGCCGGGGC GCTCCTTTTG GCCCTGCGGC 10200

ACCCCACCGA CCTCCCCCAC CTGGCCCGCC ATCGGGCTCC GCCCGGCCGG CAGACCGAGC 10260

GACTGGCCGA GGCCTGGGGC CAGCTCCTGG AGGCCTCCGC CCTGGGGTCC GGGCGGGCCG 10320

AGAGCGGCTG CGCGCGCGCG GGCCTTGTGT CGTTTAACTT TCTGGTGGCC GCGTGCGCCG 10380

CCGCCTACGA TGCGCGCGAC GCCGCCGAGG CGGTCCGGGC CCACATCACG ACCAACTACG 10440

GCGGGACGCG GGCCGGGGCG CGGCTGGACC GGTTTTCCGA ATGCCTGCGC GCCATGGTCC 10500

ACACGCACGT GTTTCCCCAC GAGGTCATGC GGTTTTTCGG GGGGCTAGTG TCGTGGGTCA 10560

CACAGGACGA GCTGGCTAGC GTCACCGCCG TCTGCAGCGG ACCCCAGGAG GCCACACACA 10620

CCGGCCACCC GGGCAGGCCC TGTTCGGCCG TTACCATCCC GGCCTGCGCC TTCGTGGACC 10680
TGGACGCCGA GCTGTGCCTG GGGGGCCCTG GGGCGGCGTT CCTGTACTTG GTCTTCACCT 10740

ACCGACAGTG CCGGGACCAG GAGCTCTGTT GCGTGTACGT GGTCAAGAGC CAGCTCCCCC 10800

CGCGCGGACT GGAGGCGGCC CTCGAGCGGC TGTTCGGGCG CCTCCGGATA ACCAACACGA 10860

TTCACGGGGC CGAGGACATG ACGCCCCCTC CCCCGAACCG AAACGTTGAC TTTCCGCTCG 10920

CCGTCCTGGC CGCGAGCTCG CAATCCCCGC GGTGCTCGGC GAGCCAAGTC ACGAACCCCC 10980

AGTTTGTCGA CAGGCTGTAC CGCTGGCAGC CGGATCTGCG GGGGCGCCCT ACCGCACGCA 11040

CCTGCACATA CGCCGCCTTC GCAGAGCTGG GTGTCATGCC AGACAACAGC CCCCGCTGTC 11100

TGCACCGCAC CGAGCGGTTT GGGGCGGTCG GCGTTCCGGT TGTCATCCTG GAGGGCGTGG 11160

TGTGGCGCCC CGGCGGGTGG CGGGCCTGCG CGTGATCGTC TATTGACGAC GGCCGCCCAA 11220

CCCGAGCGAC CTTCCCCTCC CACTTTCCCC CCCCCCCCTC CTACACACCA ACTCCGCCCT 11280

CGCCGTCTTG GCCGTGCGCG GCCCCGTGCG TCCGTCTCAA TAAAGCCAGG TTAAATCCGT 11340

GACGTGGTGT GTTTGGCGTG TGTCTCTGAA ATGGCGGAAA CCGACATGCA AATGGGATTC 11400

ATGGACACGT TACACCCCCC TGACTCAGGA GATAGGCATA TCCTCCTTAG ATTGACTCAG 11460

CACACGATCG CACCCCACCC CTGTGTGCCG GGGATAAAAG CCAACGCGGG CGGTCTGGGT 11520

TACCACAACA GGTGGGTGCT TCGGGGACTT GACGGTCGCC ACTCTCCTGC GAGCCCTCAC 11580

GTCTTCGCCC ACCGATTCCT GTTGCGTTCC TGTCGGCCGG TGCTGTCCTG TCGACAGATT 11640

GTTGGCGACT GCCCGGGTGA TTCGTCGGCC GGTGCGTCCT TTCGGTCGTA CCGCCCACCC 11700

CGCCTCCCAC GGGCCCGCCG CTGTTTCCGT TCATCGCGTC CGAGCCACCG TCACCTTGGT 11760

TCCAATGGCC AACCGCCCTG CCGCATCCGC CCTCGCCGGA GCGCGGTCTC CGTCCGAACG 11820

ACAGGAACCC CGGGAGCCCG AGGTCGCCCC CCCTGGCGGC GACCACGTGT TTTGCAGGAA 11880

AGTCAGCGGC GTGATGGTGC TTTCCAGCGA TCCCCCCGGC CCCGCGGCCT ACCGCATTAG 11940

CGACAGCAGC TTTGTTCAAT GCGGCTCCAA CTGCAGTATG ATAATCGACG GAGACGTGGC 12000

GCGCGGTCAT TTGCGTGACC TCGAGGGCGC TACGTCCACC GGCGCCTTCG TCGCGATCTC 12060

AAACGTCGCA GCCGGCGGGG ATGGCCGAAC CGCCGTCGTG GCGCTCGGCG GAACCTCGGG 12120

CCCGTCCGCG ACTACATCCG TGGGGACCCA GACGTCCGGG GAGTTCCTCC ACGGGAACCC 12180

AAGGACCCCC GAACCCCAAG GACCCCAGGC TGTCCCCCCG CCCCCTCCTC CCCCCTTTCC 12240

ATGGGGCCAC GAGTGCTGCG CCCGTCGCGA TGCCAGGGGC GGCGCCGAGA AGGACGTCGG 12300

GGCCGCGGAG TCATGGTCAG ACGGCCCGTC GTCCGACTCC GAAACGGAGG ACTCGGACTC 12360

CTCGGACGAG GATACGGGCT CGGGTTCGGA GACGCTGTCT CGATCCTCTT CGATCTGGGC 12420

CGCAGGGGCG ACTGACGACG ATGACAGCGA CTCCGACTCG CGGTCGGACG ACTCCGTGCA 12480

GCCCGACGTT GTCGTTCGTC GCAGATGGAG CGACGGCCCT GCCCCCGTGG CCTTTCCCAA 12540

GCCCCGGCGC CCCGGCGACT CCCCCGGAAA CCCCGGCCTG GGCGCCGGCA CCGGGCCGGG 12600

CTCCGCGACG GACCCGCGCG CGTCGGCCGA CTCCGATTCC GCGGCCCACG CCGCCGCACC 12660

CCAGGCGGAC GTGGCGCCGG TTCTGGACAG CCAGCCCACT GTGGGAACGG ACCCCGGCTA 12720

CCCAGTCCCC CTAGAACTCA CGCCCGAGAA CGCGGAGGCG GTGGCGCGGT TTCTGGGGGA 12780

CGCCGTCGAC CGCGAGCCCG CGCTCATGCT GGAGTACTTC TGTCGGTGCG CCCGCGAGGA 12840

GAGCAAGCGC GTGCCCCCAC GAACCTTCGG CAGCGCCCCC CGCCTCACGG AGGACGACTT 12900

TGGGCTCCTG AACTACGCGC TCGCTGAGAT GCGACGCCTG TGCCTGGACC TTCCGCCGGT 12960

CCCCCCCAAC GCATACACGC CCTATCATCT GAGGGAGTAT GCGACGCGGC TGGTTAACGG 13020

GTTCAAACCC CTGGTGCGGC GGTCCGCCCG CCTGTATCGC ATCCTGGGGA TTCTGGTTCA 13080

CCTGCGCATC CGTACCCGGG AGGCCTCCTT TGAGGAATGG ATGCGCTCCA AGGAGGTGGA 13140

CCTGGACTTC GGGCTGACGG AAAGGCTTCG CGAACACGAG GCCCAGCTAA TGATCCTGGC 13200

CCAGGCCCTG AACCCCTACG ACTGTCTGAT CCACAGCACC CCGAACACGC TCGTCGAGCG 13260
GGGGCTGCAG TCGGCGCTGA AGTACGAAGA GTTTTACCTC AAGCGCTTCG GCGGGCACTA 13320

CATGGAGTCC GTCTTCCAGA TGTACACCCG CATCGCCGGG TTCCTGGCGT GCCGGGCGAC 13380

CCGCGGCATG CGCCACATCG CCCTGGGGCG ACAGGGGTCG TGGTGGGAAA TGTTCAAGTT 13440

CTTTTTCCAC CGCCTCTACG ACCACCAGAT CGTGCCGTCC ACCCCCGCCA TGCTGAACCT 13500

CGGAACCCGC AACTACTACA CGTCCAGCTG CTACCTGGTA AACCCCCAGG CCACCACTAA 13560

CCAGGCCACC CTCCGGGCCA TCACCGGCAA CGTGAGCGCC ATCCTCGCCC GCAACGGGGG 13620

CATCGGGCTG TGCATGCAGG CGTTCAACGA CGCCAGCCCC GGCACCGCCA GCATCATGCC 13680

GGCCCTGAAG GTCCTGGACT CCCTGGTGGC GGCGCACAAC AAACAGAGCA CGCGCCCCAC 13740

CGGGGCGTGC GTGTACCTGG AACCCTGGCA CAGCGACGTT CGGGCCGTGC TCAGAATGAA 13800

GGGCGTCCTC GCCGGCGAGG AGGCCCAGCG CTGCGACAAC ATCTTCAGCG CCCTCTGGAT 13860

GCCGGACCTG TTCTTCAAGC GCCTGATCCG CCACCTCGAC GGCGAGAAAA ACGTCACCTG 13920

GTCCCTGTTC GACCGGGACA CCAGCATGTC GCTCGCCGAC TTTCACGGCG AGGAGTTCGA 13980

GAAGCTGTAC GAGCACCTCG AGGCCATGGG GTTCGGCGAA ACGATCCCCA TCCAGGACCT 14040

GGCGTACGCC ATCGTGCGCA GCGCGGCCAC CACCGGAAGC CCCTTCATCA TGTTTAAGGA 14100

CGCGGTAAAC CGCCACTACA TCTACGACAC GCAAGGGGCG GCCATTGCCG GCTCCAACCT 14160

CTGCACGGAG ATCGTCCACC CGTCCTCCAA ACGCTCCAGC GGGGTCTGCA ACCTGGGCAG 14220

CGTGAATCTG GCCCGATGCG TCTCCCGGCG GACGTTCGAT TTTGGCATGC TCCGCGACGC 14280

CGTGCAGGCG TGCGTGCTAA TGGTTAATAT CATGATAGAC AGCACGCTGC AGCCGACGCC 14340

CCAGTGCGCC CGCGGCCACG ACAACCTGCG GTCCATGGGC ATTGGCATGC AGGGCCTGCA 14400

CACGGCGTGC CTGAAGATGG GCCTGGATCT GGAGTCGGCC GAGTTCCGGG ACCTGAACAC 14460

ACACATCGCC GAGGTGATGC TGCTCGCGGC CATGAAGACC AGTAACGCGC TGTGCGTTCG 14520

CGGGGCGCGT CCCTTCAGCC ACTTTAAGCG CAGCATGTAC CGGGCCGGCC GCTTTCACTG 14580

GGAGCGCTTT TCGAACGCCA GCCCGCGGTA CGAGGGCGAG TGGGAGATGC TACGCCAGAG 14640

CATGATGAAA CACGGCCTGC GCAACAGCCA GTTCATCGCG CTCATGCCCA CCGCCGCCTC 14700

GGCCCAGATC TCGGACGTCA GCGAGGGCTT TGCCCCCCTG TTCACCAACC TGTTCAGCAA 14760

GGTGACCAGG GACGGCGAGA CGCTGCGCCC CAACACGCTC TTGCTGAAGG AACTCGAGCG 14820

CACGTTCGGC GGGAAGCGGC TCCTGGACGC GATGGACGGG CTCGAGGCCA AGCAGTGGTC 14880

TGTGGCCCAG GCCCTGCCTT GCCTGGACCC CGCCCACCCC CTCCGGCGGT TCAAGACGGC 14940

CTTCGACTAC GACCAGGAAC TGCTGATCGA CCTGTGTGCA GACCGCGCCC CCTATGTTGA 15000

TCACAGCCAA TCCATGACTC TGTATGTCAC AGAGAAGGCG GACGGGACGC TCCCCGCCTC 15060

CACCCTGGTC CGCCTTCTCG TCCACGCATA TAAGCGCGGC CTGAAGACGG GGATGTACTA 15120

CTGCAAGGTT CGCAAGGCGA CCAACAGCGG GGTGTTCGCC GGCGACGACA ACATCGTCTG 15180

CACAAGCTGC GCGCTGTAAG CAACAGCGCT CCGATCGGGG TCAGGCGTCG CTCTCGGTCC 15240

CGCATATCGC CATGGATCCC GCCGTCTCCC CCGCGAGCAC CGACCCCCTA GATACCCACG 15300

CGTCGGGGGC CGGGGCGGCC CCGATTCCGG TGTGCCCCAC CCCCGAGCGG TACTTCTACA 15360

CCTCCCAGTG CCCCGACATC AACCACCTTC GCTCCCTCAG CATCCTGAAC CGCTGGCTGG 15420

AGACCGAGCT CGTGTTCGTG GGGGACGAGG AGGACGTCTC CAAGCTCTCC GAGGGCGAGC 15480

TCGGCTTCTA CCGCTTTCTG TTTGCCTTCC TGTCGGCCGC GGACGACCTG GTGACGGAAA 15540

ACCTGGGCGG CCTCTCCGGC CTCTTCGAAC AGAAGGACAT TCTTCACTAC TACGTGGAGC 15600

AGGAATGCAT CGAGGTCGTC CACTCGCGCG TCTACAACAT CATCCAGCTG GTGCTCTTTC 15660

ACAACAACGA CCAGGCGCGC CGCGCCTATG TGGCCCGCAC CATCAACCAC CCGGCCATTC 15720

GCGTCAAGGT GGACTGGCTG GAGGCGCGGG TGCGGGAATG CGACTCGATC CCGGAGAAGT 15780

TCATCCTCAT GATCCTCATC GAGGGCGTCT TTTTTGCCGC CTCGTTCGCC GCCATCGCGT 15840
ACCTGCGCAC CAACAACCTC CTGCGGGTCA CCTGCCAGTC GAACGACCTC ATCAGCCGCG 15900

ACGAGGCCGT GCATACGACA GCCTCGTGCT ACATCTACAA CAACTACCTC GGGGGCCACG 15960

CCAAGCCCGA GGCGGCGCGC GTGTACCGGC TGTTTCGGGA GGCGGTGGAT ATCGAGATCG 16020

GGTTCATCCG ATCCCAGGCC CCGACGGACA GCTCTATCCT GAGTCCGGGG GCCCTGGCGG 16080

CCATCGAGAA CTACGTGCGA TTCAGCGCGG ATCGCCTGCT GGGCCTGATC CATATGCAGC 16140

CCCTGTATTC CGCCCCCGCC CCCGACGCCA GCTTTCCCCT CAGCCTCATG TCCACCGACA 16200

AACACACCAA CTTCTTCGAG TGCCGCAGCA CCTCGTACGC CGGGGCCGTC GTCAACGATC 16260

TGTGAGGGTC TGGGCGCCCT TGTAGCGATG TCTAACCGAA ATAAAGGGGT CGAAACGGAC 16320

TGTTGGGTCT CCGGTGTGAT TATTACGCAG GGGAGGGGGG TGGCGGCTGG GGAAAGGGAA 16380

GGAACGCCCG AAACCAGAGA AAAGGACCAA AAGGGAAACG CGTCCAACCG ATAAATCAAG 16440

CGCCGACCAG AACCCCGAGA TGCATAATAA CAAACGATTT TATTACTCTT ATTATTAACA 16500

GGTCGGGCAT CGGGAGGGGA TGGGGGCGCG CGTTTCCTCC GTTCCGGCTA CTCGTCCCAG 16560

AATTTAGCCA GGACGTCCTT GTAAAACGCG GGCGGGGGCG CGTGGGCCCA CAGCTGCGCC 16620

AGAAACCGGT CGGCGATGTC CGGGGCGGTG ATATGCCGAG TCACGATGGA GCGCGCTAAA 16680

TCTTCGTCGC GGAGGTCCTG ATAGATGGGC AGTCTTTTTA GAAGAGTCCA GGGTCCCCGC 16740

TCCTTGGGGC TGATAAGCGA TATGACGTAC TTGACGTATC TGTGCTCCAC CAGCTCGGCG 16800

ATGGTCATCG GATCGGGCAG CCAGTCCAGG GCCTCCGGGG CGTCGTGGAT GACGTGGCGG 16860

CGACGTCCGG CGACATAGCC GCGGTGTTCC GCGACCCGCT GCGCGTTGGG GACCTGCACG 16920

AGCTCGGGCG GGGTGAGTAT CTCCGAGGAG GACGACCGGG CGCCGTCGCG CGGCCCACCG 16980

GCGACGTCCG GGGGCTGGAG GGGGGGGTCT TCTTCGTAGT CGTCCTCGCC CGCGATCTGT 17040

TGGGCCAGAA TTTCGGTCCA CGAGATGCGC GTCTCGAGGC CGACCGGGGC CGCGGTCAGC 17100

GTAGGCATGC TCTCCAGGGA GCGCGAGTTG GCGCGCTCCC GCCGGGCCGC CCGGCGGGCC 17160

TGGGATCGGC TCGGGGCGGT CCAGTGACAC TCGCGCAGCA CGTCCTCGAC GGACGCGTAG 17220

GTGTTATTGG GGTGCAGGTC TGTGTGGCAG CGGACGAACA GCGCCAGGAA CTGCGGGTAA 17280

CTCATCTTGA AGTACTGCAG CAGGTCGCGG CAGTGAATCG TCGGAATGTA GCCGGTGCTG 17340

ATGTCCAACA CGATATCGCA GCCCATCAGC AGGAGATCGG TATCCGTGGT ATGCACGTAC 17400

GCGACCGTGT TGGTATGATA GAGGTTCGCG CAGGCGTCGT CGGCCTCCAG CTGACCCGAG 17460

TTGATGTAGG CGTACCCCAG CGCCCGCAGA ACGCGGATAC AGAACAGGTG AGCCAGGCGC 17520

AGGGCCGGCT TCGAGGGCGC GCCCCAGGGG GCCGCCGGGC CTGGGCCGGC GGCCCGCGTT 17580

CCCCGGTCCC CCGGGGCGAA GGCGTGCCCG CGGCGGCGCA TGTTGGAAAA AGGCGAAACT 17640

GGGCCTGGAG TCGGTGATGG GGGAAGGCGG CGGCGAGGCG TCTACGTCAC TGGCCTCCTC 17700

GTCCGTGCGG CACTGGGCCG TCGTGCGGGC CAGGATCGCC TTGGCCCCGA ACACAACCGG 17760

CTCGGTACAC TCGACCCCGC GATCGGTCAC GAAGATGGGG AACAGGGACT TTTGGGTAAA 17820

CACCCGTAAC ATACTACAGA GACAGTGTAG CGTGATTGCC TCGCGGTCGT AACTTGGGTA 17880

GCGGCGCTGA TATTTAACCA CCAGGGTATA CATGACATTC CACAGGTCCA CGGCGATGGG 17940

GGTAAAGTAG CCCTCCGGGG CCCGGAGGCC CCGGCGCTTC ACCAGATGGT GAGTCTGGGC 18000

AAACTTCATC ATGCCAAACA GACCCATTCC GGCACGATTG TAGGTGCGGA TAGGTCTCTC 18060

TACAGAGCTG TATAGGTGTG ACGGTCCGGG ACACCCAAGC CCGCCGCCCC TGTGTACAGT 18120

GGCTGCGGCG ACGACCCCGC TCCAACAAGA CGCTATCCCG GGAAAGGCAC GCTCTTTATA 18180

ATTCTTTTTT ATTTCCCATC TACGTGCGGA TTGGTGCAAC CGCCGGCGCG CGCCGGTGCA 18240

GGCCGACCAT CTCTCTCTTC CCCCCCTCCC CCTCCCCCGA GCCCTCAAAG AGGGTGTGGC 18300

CTAACTAGCG GAAGGCGTAT TTAACCAGAC TAGGGCGGCG GGTCCGCCGT AGTCCTTGGC 18360

TCGGGTAGCC ACTGCTCTGT GGCTCGGGTC CCCCGGCCCC CCTAACCCCC ATCCGGTCCG 18420
CGTCATCCGC CCCCTCCGCC TGCGACACAA ACGGCCGCGC CTCCGGGCCC GGTGACACGA 18480

CGCGCCTCGT CTCTGCGGAT TGTCCCGGGA GCGTCGCGGC ATGGCTCATC TTCCCGGCGG 18540

TGCGGCCGCC GCCCCCCTTT CGGAGGACGC GATCCCGTCG CCGCGCGAGC GGACGGAAGA 18600

CTGGCCGCCC TGCCAGATAG TGCTGCAGGG CGCCGAGCTG AACGGGATCC TGCAGGCCTT 18660

TGCGCCGCTT CGCACGAGCC TTTTGGACTC GCTCCTGGTC GTGGGCGACC GAGGCATCCT 18720

TGTACATAAC GCGATTTTCG GCGAGCAGGT GTTTCTGCCC CTCGACCATT CGCAGTTCAG 18780

TCGCTATCGA TGGGGCGGAC CCACCGCGGC GTTCCTGTCT CTCGTGGACC AGAAGCGATC 18840

CCTGCTGAGC GTGTTTCGCG CCAACCAGTA CCCTGACCTG CGGCGGGTGG AGCTGACGGT 18900

CACGGGCCAG GCCCCGTTTC GCACGCTGGT GCAGCGCATA TGGACGACCG CGTCCGACGG 18960

AGAGGCCGTG GAGCTTGCCA GCGAGACGGT CATGAAACGC GAGTTGACGA GCTTCGCGGT 19020

ACTACTCCCC CAGGGCGACC CCGACGTCCA GCTGCGCCTC ACGAAGCCCC AGCTCACGAA 19080

GGTGGTGAAC GCCGTCGGGG ACGAGACCGC CAAACCCACC ACGTTCGAGC TCGGCCCCAA 19140

CGGCAAGTTT TCCGTGTTTA ACGCGCGCAC CTGCGTCACC TTTGCCGCCC GCGAGGAGGG 19200

CGCGTCGTCC AGCACCAGCG CCCAGGTCCA GATTCTGACC AGCGCGCTGA AGAAGGCGGG 19260

CCAGGCGGCC GCCAACGCCA AGACGGTCTA CGGGGAAAAC ACACACCGCA CATTCTCGGT 19320

GGTCGTCGAC GACTGCAGCA TGCGGGCGGT CCTCCGGCGG CTCCAGGTCG GCGGGGGGAC 19380

CCTCAAGTTC TTCCTCACGG CCGACGTCCC CAGCGTGTGT GTCACCGCCA CCGGCCCCAA 19440

CGCGGTGTCG GCGGTGTTTC TTTTAAAACC CCAGCGGGTC TGCCTGAACT GGCTCGGCCG 19500

GACCCCGGGT TCCTCGACCG GGAGCTTGGC GTCCCAGGAC TCTCGGGCCG GCCCGACCGA 19560

CAGCCAGGAC TTCTCCTCCG AGCCGGACGC GGGCGACCGC GGCGCCCCAG AAGAAGAAGG 19620

CCTCGAGGGC CAGGCCCGGG TCCCGCCCGC GTTCCCGGAA CCGCCGGGAA CCAAGCGGAG 19680

GCACGCCGGG GCCGAAGTTG TCCCCGCGGA CGACGCCACC AAGCGCCCGA AGACGGGCGT 19740

GCCCGCCGCC CCCACGCGAG CCGAGTCGCC CCCCCTCTCC GCGAGATACG GACCCGAGGC 19800

GGCGGAGGGT GGTGGGGACG GCGGCCGCTA CGCGTGCTAC TTTCGCGACC TCCAGACCGG 19860

CGACGCGAGC CCCAGCCCCC TCTCCGCCTT CCGGGGTCCC CAAAGACCCC CATACGGCTT 19920

TGGGTTGCCC TGACGGCGAC GGGTGGTGGC CGAACGCTTC ACCGCGCCCG GGCACGCGGG 19980

GTGCGTTQTG TTAAAAAAAT AAATAAATGG GGTAGTGTGT CCCCCCCCTC CAACCAATAT 20040

GGCTGTCGTG TGTGGTTCCG GGTTGCGCCT CCGTCCTTTC CACCCCCCTT CCCCCTCCTT 20100

TTTTGTTTTG CGTGCGCTTA TAAGAGCGGG CCCGGGGCCC TTCGCAGCTT CACCGAGAGC 20160

GCCGTCGGGC CCCGGGTGCG GGATGTGTCG CGGGGACAGC CCCGGGGTCG CGGGCGGGAG 20220

CGGCGAACAC TGCCTCGGAG GGGATGATGG GGACGACGGG CGCCCCCGCC TCGCCTGCGT 20280

GGGTGCCATC GCTCGGGGGT TCGCGCATCT CTGGCTCCAG GCCACCACGC TGGGCTTCGT 20340

GGGGTCTGTC GTTCTGTCGC GCGGCCCGTA TGCGGACGCC ATGTCGGGGG CGTTCGTGAT 20400

CGGGAGCACC GGCCTGGGGT TCCTCCGCGC CCCCCCCGCG TTCGCCCGGC CGCCGACGCG 20460

TGTGTGCGCG TGGCTGAGGC TGGTCGGCGG GGGAGCGGCC GTGGCCCTGT GGAGCCTCGG 20520

GGAGGCCGGC GCGCCTCCGG GGGTTCCGGG CCCGGCGACC CAGTGCCTGG CGCTCGGGGC 20580

CGCCTACGCG GCGCTGCTGG TGCTGGCCGA CGACGTCCAT CCCCTTTTCC TCCTCGCCCC 20640

GCGGCCCCTG TTTGTCGGCA CCCTGGGGGT TGTCGTCGGC GGGCTGACGA TAGGCGGCAG 20700

TGCGCGCTAC TGGTGGATCG ACCCCCGCGC CGCCGCGGCC CTGACGGCGG CGGTGGTGGC 20760

GGGCCTCGGG ACAACCGCCG CCGGGGACAG CTTTTCCAAG GCCTGTCCCC GCCACCGCCG 20820

CTTTTGCGTC GTCTCCGCGG TCGAGTCTCC CCCGCCCCGA TACGCCCCGG AGGACGCCGA 20880

GCGGCCAACA GACCACGGAC CCCTGTTACC GTCGACGCAC CACCAGCGAT CTCCGCGGGT 20940

CTGCGGCGAC GGGGCCGCAC GGCCCGAAAA CATCTGGGTT CCCGTGGTGA CCTTTGCGGG 21000
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CGCGCTCGCG CTGGCCGCCT GCGCCGCGCG AGGG 21035 
(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1850 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Val Ala Gly Ala Ala His Met Ile Pro Ala Ala Leu Pro His Pro Thr
1 5 10 15

Met Lys Arg Gln Gly Asp Arg Asp Ile Val Val Thr Gly Val Arg Asn
20 25 30

Gln Phe Ala Thr Asp Leu Glu Pro Gly Gly Ser Val Ser Cys Met Arg
35 40 45

Ser Ser Leu Ser Phe Leu Ser Leu Leu Phe Asp Val Gly Pro Arg Asp
50 55 60

Val Leu Ser Ala Glu Ala Ile Glu Gly Cys Leu Val Glu Gly Gly Glu
65 70 75 80

Trp Thr Arg Ala Ala Ala Gly Ser Gly Pro Pro Arg Met Cys Ser Ile
85 90 95

Ile Glu Leu Pro Asn Phe Leu Glu Tyr Pro Ala Arg Gly Leu Arg Cys
100 105 110

Val Phe Ser Arg Val Tyr Gly Glu Val Gly Phe Phe Gly Glu Pro Thr
115 120 125

Ala Gly Leu Leu Glu Thr Gln Cys Pro Ala His Thr Phe Phe Ala Gly
130 135 140

Pro Trp Ala Met Arg Pro Leu Ser Tyr Thr Leu Leu Thr Ile Gly Pro
145 150 155 160

Leu Gly Met Gly Arg Asp Gly Asp Thr Ala Tyr Leu Phe Asp Pro His
165 170 175

Gly Leu Pro Ala Gly Thr Pro Ala Phe Ile Ala Lys Val Arg Ala Gly
180 185 190

Asp Val Tyr Pro Tyr Leu Thr Tyr Tyr Ala His Asp Arg Pro Lys Val
195 200 205

Arg Trp Ala Gly Ala Met Val Phe Phe Val Pro Ser Gly Pro Gly Ala
210 215 220

Val Ala Pro Ala Asp Leu Thr Ala Ala Ala Leu His Leu Tyr Gly Ala
225 230 235 240

Ser Glu Thr Tyr Leu Gln Asp Glu Pro Phe Val Glu Arg Arg Val Ala
245 250 255

Ile Thr His Pro Leu Arg Gly Glu Ile Gly Gly Leu Gly Ala Leu Phe
260 265 270

Val Gly Val Val Pro Arg Gly Asp Gly Glu Gly Ser Gly Pro Val Val
275 280 285

Pro Ala Leu Pro Ala Pro Thr His Val Gln Thr Pro Arg Ala Asp Arg
290 295 300

Pro Pro Glu Ala Pro Arg Gly Ala Ser Gly Pro Pro Asn Thr Pro Gln
305 310 315 320

Ala Gly His Pro Asn Arg Pro Pro Asp Asp Val Trp Ala Ala Ala Leu
325 330 335

Glu Gly Thr Pro Pro Ala Lys Pro Ser Ala Pro Asp Ala Ala Ala Ser
340 345 350

Gly Pro Pro His Ala Ala Pro Pro Pro Gln Thr Pro Ala Gly Asp Ala
355 360 365

Ala Glu Glu Ala Glu Asp Leu Arg Val Leu Glu Val Gly Ala Val Pro
370 375 380

Val Gly Arg His Arg Ala Arg Tyr Ser Thr Gly Leu Pro Lys Arg Arg
385 390 395 400

Arg Pro Thr Trp Thr Pro Pro Ser Ser Val Glu Asp Leu Thr Ser Gly
405 410 415

Glu Arg Pro Ala Pro Lys Ala Pro Pro Ala Lys Ala Lys Lys Lys Ser
420 425 430

Ala Pro Lys Lys Lys Ala Pro Val Ala Ala Glu Val Pro Ala Ser Ser
435 440 445

Pro Thr Pro Ile Ala Ala Thr Val Pro Pro Ala Pro Asp Thr Pro Pro
450 455 460

Gln Ser Gly Gln Gly Gly Gly Asp Asp Gly Pro Asp Ser Ser Pro Ser
465 470 475 480

Val Leu Glu Thr Leu Gly Ala Arg Arg Pro Pro Glu Pro Pro Gly Ala
485 490 495

Asp Leu Ala Gln Leu Phe Glu Val His Pro Asn Val Ala Ala Thr Ala
500 505 510

Val Arg Leu Ala Ala Arg Asp Ala Ala Arg Glu Val Ala Ala Cys Ser
515 520 525

Gln Leu Thr Ile Asn Ala Leu Arg Ser Pro Tyr Pro Ala His Pro Gly
530 535 540

Leu Leu Glu Leu Cys Val Ile Phe Phe Phe Glu Arg Val Leu Ala Phe
545 550 555 560

Leu Ile Glu Asn Gly Ala Arg Thr His Thr Gln Ala Gly Val Ala Gly
565 570 575

Pro Ala Ala Ala Leu Leu Asp Phe Thr Leu Arg Met Leu Pro Arg Lys
580 585 590

Thr Ala Val Gly Asp Phe Leu Ala Ser Thr Arg Met Ser Leu Ala Asp
595 600 605

Val Ala Ala His Arg Pro Leu Ile Gln His Val Leu Asp Glu Asn Ser
610 615 620

Gln Ile Gly Arg Leu Ala Lys Leu Val Leu Val Ala Arg Asp Val Ile
625 630 635 640

Arg Glu Thr Asp Ala Phe Tyr Gly Asp Leu Ala Asp Leu Asp Leu Gln
645 650 655

Leu Arg Ala Ala Pro Pro Ala Asn Leu Tyr Ala Arg Leu Gly Glu Trp
660 665 670

Leu Leu Glu Arg Ser Arg Ala His Pro Asn Thr Leu Phe Ala Pro Ala
675 680 685

Thr Pro Thr His Pro Glu Pro Leu Leu His Arg Ile Gln Ala Gln Phe
690 695 700

Arg Glu Glu Met Arg Val Glu Ala Glu Ala Arg Glu Met Arg Glu Ala
705 710 715 720

Leu Asp Arg Val Asp Ser Val Ser Gln Arg Ala Gly Pro Leu Thr Val
725 730 735

Met Pro Val Pro Ala Ala Pro Gly Ala Gly Gly Arg Ala Pro Cys Pro
740 745 750

Pro Ala Leu Gly Pro Glu Ala Ile Gln Ala Arg Leu Glu Asp Val Arg
755 760 765

Ile Gln Ala Arg Arg Ala Ile Glu Ser Ala Ile Lys Glu Tyr Phe His
770 775 780

Arg Gly Ala Val Tyr Ser Ala Lys Ala Leu Gln Ala Ser Asp Ser His
785 790 795 800

Asp Cys Arg Phe His Val Ala Ser Ala Ala Val Val Pro Met Val Gln
805 810 815

Leu Leu Glu Ser Leu Pro Ala Phe Asp Gln His Thr Arg Asp Val Ala
820 825 830

Gln Arg Ala Ala Leu Pro Pro Pro Pro Pro Leu Ala Thr Ser Pro Gln
835 840 845

Ala Ile Leu Leu Arg Asp Leu Leu Gln Arg Gly Gln Thr Leu Asp Ala
850 855 860

Pro Glu Asp Leu Ala Ala Trp Leu Ser Val Leu Thr Asp Ala Ala Thr
865 870 875 880

Gln Gly Leu Ile Glu Arg Lys Pro Leu Glu Glu Leu Ala Arg Ser Ile
885 890 895

His Gly Ile Asn Asp Gln Gln Ala Arg Arg Ser Ser Gly Leu Ala Glu
900 905 910

Leu Gln Arg Phe Asp Ala Leu Asp Ala Ala Gln Gln Leu Asp Ser Asp
915 920 925

Ala Ala Phe Val Pro Ala Thr Gly Pro Ala Pro Tyr Val Asp Gly Gly
930 935 940

Gly Leu Ser Pro Glu Ala Thr Arg Met Ala Glu Asp Ala Leu Arg Gln
945 950 955 960

Ala Arg Ala Met Glu Ala Ala Lys Met Thr Ala Glu Leu Ala Pro Glu
965 970 975

Ala Arg Ser Arg Leu Arg Glu Arg Ala His Ala Leu Glu Ala Met Leu
980 985 990

Asn Asp Ala Arg Glu Arg Ala Lys Val Ala His Asp Ala Arg Glu Lys
995 1000 1005

Phe Leu His Lys Leu Gln Gly Val Leu Arg Pro Leu Pro Asp Phe Val
1010 1015 1020

Gly Leu Lys Ala Cys Pro Ala Val Leu Ala Thr Leu Arg Ala Ser Leu
1025 1030 1035 1040

Pro Ala Gly Trp Thr Asp Leu Ala Asp Ala Val Arg Gly Pro Pro Pro
1045 1050 1055

Glu Val Thr Ala Ala Leu Arg Ala Asp Leu Trp Gly Leu Leu Gly Gln
1060 1065 1070

Tyr Arg Glu Ala Leu Glu His Pro Thr Pro Asp Thr Ala Thr Ala Gly
1075 1080 1085

Leu His Pro Ala Phe Val Val Val Leu Lys Thr Leu Phe Ala Asp Ala
1090 1095 1100

Pro Glu Thr Pro Val Leu Val Gln Phe Phe Ser Asp His Ala Pro Thr
1105 1110 1115 1120

Ile Ala Lys Ala Val Ser Asn Ala Ile Asn Ala Gly Ser Ala Ala Val
1125 1130 1135

Ala Thr Asp Ala Ala Thr Val Asp Ala Ala Val Arg Ala His Gly Ala
1140 1145 1150

Asp Ala Val Ser Ala Leu Gly Ala Ala Ala Arg Asp Pro Asp Leu Ser
1155 1160 1165

Phe Leu Ala Ala Asp Ser Ala Ala Gly Tyr Val Lys Ala Thr Arg Leu
1170 1175 1180

Ala Leu Glu Arg Ala Ile Asp Glu Leu Thr Thr Leu Gly Ser Ala Ala
1185 1190 1195 1200

Ala Asp Leu Val Val Gln Ala Arg Arg Ala Cys Ala Gln Pro Glu Gly
1205 1210 1215

Asp His Ala Ala Leu Ile Asp Ala Ala Ala Arg Ala Thr Thr Ala Ala
1220 1225 1230

Arg Glu Ser Leu Ala Gly His Glu Ala Gly Phe Gly Gly Leu Leu His
1235 1240 1245

Ala Glu Gly Thr Ala Gly Asp His Ser Pro Ser Gly Arg Ala Leu Gln
1250 1255 1260

Glu Leu Gly Lys Val Ile Gly Ala Thr Arg Arg Arg Ala Asp Glu Leu
1265 1270 1275 1280

Glu Ala Ala Val Ala Asp Leu Thr Ala Lys Met Ala Ala Gln Arg Arg
1285 1290 1295

Ser Ser Trp Ala Ala Gly Val Glu Ala Ala Leu Asp Arg Val Glu Asn
1300 1305 1310

Arg Ala Glu Phe Asp Val Val Glu Leu Arg Arg Leu Gln Ala Gly Thr
1315 1320 1325

His Gly Tyr Asn Pro Arg Asp Phe Arg Lys Arg Ala Glu Gln Ala Ala
1330 1335 1340

Asn Ala Glu Ala Val Thr Leu Ala Leu Asp Thr Ala Phe Ala Phe Asn
1345 1350 1355 1360

Pro Tyr Thr Pro Glu Asn Gln Arg His Pro Met Leu Pro Pro Leu Ala
1365 1370 1375

Ala Ile His Arg Leu Gly Trp Ser Ala Ala Phe His Ala Ala Ala Glu
1380 1385 1390

Thr Tyr Ala Asp Met Phe Arg Val Asp Ala Glu Pro Leu Ala Arg Leu
1395 1400 1405

Leu Arg Ile Ala Glu Gly Leu Leu Glu Met Ala Gln Ala Gly Asp Gly
1410 1415 1420

Phe Ile Asp Tyr His Glu Ala Val Gly Arg Leu Ala Asp Asp Met Thr
1425 1430 1435 1440

Ser Val Pro Gly Leu Arg Arg Tyr Val Pro Phe Phe Gln His Gly Tyr
1445 1450 1455

Ala Asp Tyr Val Glu Leu Arg Asp Arg Leu Asp Ala Ile Arg Ala Asp
1460 1465 1470

Val His Arg Ala Leu Gly Gly Val Pro Leu Asp Leu Ala Ala Ala Ala
1475 1480 1485

Glu Gln Ile Ser Ala Ala Arg Asn Asp Pro Glu Ala Thr Ala Glu Leu
1490 1495 1500

Val Arg Thr Gly Val Thr Leu Pro Cys Pro Ser Glu Asp Ala Leu Val
1505 1510 1515 1520

Ala Cys Ala Ala Ala Leu Glu Arg Val Asp Gln Ser Pro Val Lys Asn
1525 1530 1535

Thr Ala Tyr Ala Glu Tyr Val Ala Phe Val Thr Arg Gln Asp Thr Ala
1540 1545 1550

Glu Thr Lys Asp Ala Val Val Arg Ala Lys Gln Gln Arg Ala Glu Ala
1555 1560 1565

Thr Glu Arg Val Met Ala Gly Leu Arg Glu Ala Ala Arg Glu Arg Arg
1570 1575 1580

Ala Gln Ile Glu Ala Glu Gly Leu Ala Asn Leu Lys Thr Met Leu Lys
1585 1590 1595 1600

Val Val Ala Val Pro Ala Thr Val Ala Lys Thr Leu Asp Gln Ala Arg
1605 1610 1615

Ser Val Ala Glu Ile Ala Asp Gln Val Glu Val Leu Leu Asp Gln Thr
1620 1625 1630

Glu Lys Thr Arg Glu Leu Asp Val Pro Ala Val Ile Trp Leu Glu His
1635 1640 1645

Ala Gln Arg Thr Phe Glu Thr His Pro Leu Ser Ala Arg Asp Gly Pro
1650 1655 1660

Gly Pro Leu Ala Arg His Ala Gly Arg Leu Gly Ala Leu Phe Asp Thr
1665 1670 1675 1680

Arg Arg Arg Val Asp Ala Leu Arg Arg Ser Leu Glu Glu Ala Glu Ala
1685 1690 1695

Glu Trp Asp Glu Val Trp Gly Arg Phe Gly Arg Val Arg Gly Gly Ala
1700 1705 1710

Trp Lys Ser Pro Glu Gly Phe Arg Ala Met His Glu Gln Leu Arg Ala
1715 1720 1725

Leu Gln Asp Thr Thr Asn Thr Val Ser Gly Leu Arg Ala Gln Pro Ala
1730 1735 1740

Tyr Glu Arg Leu Ser Ala Arg Tyr Gln Gly Val Leu Gly Ala Lys Gly
1745 1750 1755 1760

Ala Glu Arg Ala Glu Ala Val Glu Glu Leu Gly Ala Arg Val Thr Lys
1765 1770 1775

His Thr Ala Leu Cys Ala Arg Leu Arg Asp Glu Val Val Arg Arg Val
1780 1785 1790

Pro Trp Glu Met Asn Phe Asp Ala Leu Gly Arg Leu Leu Ala Glu Phe
1795 1800 1805

Asp Ala Ala Ala Ala Asp Leu Ala Pro Trp Ala Val Glu Glu Phe Arg
1810 1815 1820

Gly Ala Arg Glu Leu Ile Gln Tyr Arg Met Gly Ser Ala Tyr Ala Arg
1825 1830 1835 1840

Ala Gly Gly Gln Thr Xaa Xaa Xaa Xaa Xaa
1845 1850



(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1100 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Met Ser Asp Ser Ala Leu Gln Val Pro Ala Pro Ala Gly Met Thr Pro
1 5 10 15

Pro Ser Ala Pro Pro Pro Asn Gly Pro Leu Gln Val Leu Leu Gly Ser
20 25 30

Leu Thr Asn Leu Arg Arg Pro Pro Ser Pro Ser Ser Glu Pro Ala Gly
35 40 45

Ser Ala Asp Glu Pro Ala Phe Leu Ser Ala Ala Lys Leu Arg Ala Ala
50 55 60

Thr Ala Ala Phe Leu Leu Ser Gly Ala Ala Val Gly Pro Ala Glu Ala
65 70 75 80

Arg Ala Cys Trp His Pro Leu Leu Glu Gln Leu Cys Ala Leu His Arg
85 90 95

Ala His Gly Leu Pro Glu Thr Ala Leu Leu Ala Glu Asn Leu Pro Gly
100 105 110

Leu Leu Val His Arg Met Ala Val Pro Glu Thr Pro Glu Ala Ala Phe
115 120 125

Arg Glu Met Asp Val Ile Lys Asp Thr Val Leu Ala Ile Thr Gly Ser
130 135 140

Asp Thr Thr His Ala Leu Glu Ala Ala Gly Leu Arg Thr Thr Ala Ala
145 150 155 160

Leu Gly Pro Val Arg Val Arg Gln Cys Ala Val Glu Trp Ile Asp Arg
165 170 175

Trp Arg Thr Val Thr Gln Ser Cys Leu Ala Met Asn Pro Arg Thr Ser
180 185 190

Leu Glu Ala Leu Gly Glu Met Ser Leu Lys Met Ser Pro Val Pro Leu
195 200 205

Gly Gln Pro Gly Ala Asn Leu Thr Thr Pro Ala Tyr Ser Leu Leu Phe
210 215 220

Pro Ser Pro Ile Val Gln Glu Gly Leu Arg Phe Leu Ala Leu Val Ser
225 230 235 240

Asn Trp Val Thr Leu Phe Ser Ala His Leu Gln Arg Ile Asp Asp Ala
245 250 255

Ala Leu Thr Pro Leu Thr Arg Ala Leu Phe Thr Leu Ala Leu Val Asp
260 265 270

Glu Tyr Leu Thr Thr Pro Asp Arg Gly Ala Val Val Pro Pro Pro Leu
275 280 285

Leu Ala Gln Phe Gln His Thr Val Arg Glu Ile Asp Pro Ala Ile Met
290 295 300

Ile Pro Pro Leu Glu Ala Thr Lys Met Val Arg Ser Arg Glu Glu Val
305 310 315 320

Arg Val Ser Thr Ala Leu Ser Arg Val Ser Pro Arg Ser Ala Cys Ala
325 330 335

Pro Pro Gly Thr Leu Met Ala Arg Val Arg Thr Asp Ala Ala Val Phe
340 345 350

Asp Pro Asp Val Pro Phe Leu Ser Ala Ser Ala Ile Phe Arg Pro Ala
355 360 365

Val Thr Gly Leu Leu Gln Leu Gly Glu Pro Pro Ser Ala Gly Ala Gln
370 375 380

Gln Arg Leu Leu Ala Leu Leu Gln Gln Thr Trp Ala Leu Val Gln Asn
385 390 395 400

Ser Asn Ser Pro Ser Val Val Ile Asn Thr Leu Thr Asp Ala Gly Phe
405 410 415

Thr Pro Ala His Cys Thr Gln Tyr Ile Ser Ala Leu Glu Gly Phe Leu
420 425 430

Val Ala Gly Val Pro Ala Arg Thr Pro Pro Gly His Gly Leu Ser Glu
435 440 445

Ile Gln Gln Leu Phe Gly Cys Ile Ala Gly Ala Asn Val Phe Gly Leu
450 455 460

Ala Arg Glu Tyr Gly His Tyr Ala Gly Tyr Val Lys Thr Phe Arg Arg
465 470 475 480

Ile Gln Gly Ala Ser Glu His Thr His Gly Arg Leu Cys Glu Ala Val
485 490 495

Gly Leu Ser Gly Gly Val Leu Ser Gln Thr Leu Ala Arg Ile Met Gly
500 505 510

Pro Ala Val Pro Thr Glu His Leu Ala Ser Leu Arg Arg Thr Leu Val
515 520 525

Gly Glu Phe Glu Thr Ala Glu Arg Arg Phe Ser Ala Gly Gln Pro Ser
530 535 540

Leu Leu Arg Glu Thr Ala Leu Ile Trp Leu Asp Val Tyr Gly Gln Thr
545 550 555 560

His Trp Asp Leu Thr Pro Thr Thr Pro Ala Thr Pro Leu Ser Ala Leu
565 570 575

Leu Pro Val Gly Pro Pro Ser His Ala Pro Ser Val His Leu Ala Ala
580 585 590

Ala Thr Lys Ile Arg Phe Pro Ala Leu Glu Gly Ile His Pro Asn Val
595 600 605

Leu Ala Asp Pro Gly Phe Val Pro Tyr Val Leu Ala Leu Val Val Gly
610 615 620

Asp Ala Leu Arg Ala Thr Cys Asn Ala Ala Tyr Leu Pro Arg Pro Ile
625 630 635 640

Glu Phe Ala Leu Arg Val Leu Ala Trp Ala Arg Asp Phe Gly Leu Gly
645 650 655

Tyr Leu Pro Thr Val Glu Gly His Arg Thr Lys Leu Gly Ala Leu Ile
660 665 670

Thr Leu Leu Glu Pro Ala Thr Arg Ala Gly Val Gly Pro Thr Met Gln
675 680 685

Met Ala Asp Asn Ile Glu Gln Leu Leu Arg Glu Leu Tyr Val Ile Arg
690 695 700

Ala Val Glu Gln Leu Arg Pro Ala Val Gln Leu Pro Pro Pro Gln Pro
705 710 715 720

Pro Glu Val Gly Ser Ser Leu Leu Leu Ile Ser Met Tyr Ala Arg Val
725 730 735

Leu Gln Glu Leu Ala Glu Arg Ala Asp Pro Leu Val Arg Gln Leu Glu
740 745 750

Asp Ala Ile Val Leu Leu Arg Leu His Met Arg Thr Leu Ala Ala Phe
755 760 765

Phe Glu Cys Arg Phe Glu Ser Asp Gly His Arg Leu Tyr Ala Val Val
770 775 780

Ala Asp Ala His Glu Arg Leu Gly Pro Trp Arg Pro Glu Ala Met Gly
785 790 795 800

Asp Ala Val Ser Gln Tyr Cys Gly Met Tyr His Asp Ala Lys Arg Ala
805 810 815

Leu Val Ala Ser Leu Ala Gly Leu Arg Ser Val Val Thr Glu Thr Thr
820 825 830

Ala His Leu Gly Val Cys Asp Glu Leu Ala Ala Gln Val Ser His Glu
835 840 845

Gly Asn Val Leu Ala Val Val Arg Arg Glu Ile His Gly Phe Leu Ala
850 855 860

Ile Val Ser Gly Ile His Ala Arg Ala Ser Lys Leu Met Ser Gly Asp
865 870 875 880

Gln Val Pro Gly Phe Cys Tyr Met Ser Gln Phe Leu Ala Arg Trp Arg
885 890 895

Arg Leu Ser Ala Gly Tyr Gln Ala Ala Arg Ala Ala Thr Gly Pro Glu
900 905 910

Arg Val Ala Glu Phe Val Gln Glu Leu His Asp Thr Trp Lys Gly Leu
915 920 925

Gln Thr Glu Arg Ala Leu Val Val Ala Pro Phe Ala Ser Ser Ala Asp
930 935 940

Gln Arg Thr Ala Ala Ile Gln Glu Val Met Ala His Ala Thr Glu Asp
945 950 955 960

Ala Pro Pro Ser Pro Ala Ala Asp Leu Val Val Leu Thr Asn Arg His
965 970 975

Asp Leu Gly Ala Trp Gly Asp Tyr Ser Leu Gly Pro Leu Gly Gln Pro
980 985 990

Thr Val Val Pro Asp Ser Val Asp Leu Ser Pro Gln Gly Leu Ala Ala
995 1000 1005

Thr Leu Ser Met Asp Trp Leu Leu Ile Asn Glu Leu Leu Gln Val Thr
1010 1015 1020

Asp Gly Val Phe Arg Ala Ser Ala Phe Arg Pro Ser Ala Gly Pro Glu
1025 1030 1035 1040

Ala Pro Gly Asp Leu Glu Ala Gln Asp Ala Gly Gly Ser Thr Pro Glu
1045 1050 1055

Pro Thr Thr Pro Gly Pro Gln Asp Thr Gln Ala Arg Ala Pro Ser Trp
1060 1065 1070

Ala Gly Arg Glu Thr Val Pro Trp Pro Asn Thr Pro Val Glu Asp Asp
1075 1080 1085

Glu Met Thr Pro Gln Glu Thr Pro Pro Val His Pro
1090 1095 1100

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 641 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Val Glu Arg Thr Gly Gly Ser Cys Arg Arg Ala Pro Gly Pro Gly Ala
1 5 10 15

Arg Cys Pro Thr Trp Arg Pro Ala Cys Ala Leu Gly Asp Ala Ala Arg
20 25 30

Arg Pro Arg Ala Gln Thr Gly Met Thr Ala Ala Ala Leu Tyr Gly Gly
35 40 45

Ala Lys Tyr Arg Pro Gly Thr Leu Arg Asn Pro Gly Arg Val Ala Ser
50 55 60

Thr Pro Arg Arg Arg Gly Val Leu Tyr Gly Ala Leu Cys Pro Gly Ile
65 70 75 80

Pro Phe Val Gly Ser Gly Pro Gly Ala Val Gly Trp Glu Cys Val Cys
85 90 95

Val Gly Gly Gly Arg Arg Asp Gly Gly Pro Asp Gln Val Tyr Arg Gly
100 105 110

Arg Ser Val Gly Arg Pro Asn Arg Pro Phe Lys His Leu Arg Met His
115 120 125

Arg Pro Ser Gln Ser Asp Thr Gly Thr His Gln Arg Arg Lys Pro Pro
130 135 140

Ser Pro Val Arg Val Arg Val Phe Ser Gly Gly Val Phe Phe Leu Ser
145 150 155 160

Ala Leu Leu Pro Pro His Leu His His Pro Pro Pro Thr Trp Leu Ala
165 170 175

Ile Gly Gly Lys Thr Met Lys Thr Lys Pro Leu Pro Thr Ala Pro Met
180 185 190

Ala Trp Ala Glu Ser Ala Val Glu Thr Thr Thr Ser Pro Arg Glu Leu
195 200 205

Ala Gly His Ala Pro Leu Arg Arg Val Leu Arg Pro Pro Ile Ala Arg
210 215 220

Arg Asp Gly Pro Val Leu Leu Gly Asp Arg Ala Pro Arg Arg Thr Ala
225 230 235 240

Ser Thr Met Trp Leu Leu Gly Ile Asp Pro Ala Glu Ser Ser Pro Gly
245 250 255

Thr Arg Ala Thr Arg Asp Asp Thr Glu Gln Ala Val Asp Lys Ile Leu
260 265 270

Arg Gly Ala Arg Arg Ala Gly Gly Leu Thr Val Pro Gly Ala Pro Arg
275 280 285

Tyr His Leu Thr Arg Gln Val Thr Leu Thr Asp Leu Cys Gln Pro Asn
290 295 300

Ala Glu Arg Ala Gly Ala Leu Leu Leu Ala Leu Arg His Pro Thr Asp
305 310 315 320

Leu Pro His Leu Ala Arg His Arg Ala Pro Pro Gly Arg Gln Thr Glu
325 330 335

Arg Leu Ala Glu Ala Trp Gly Gln Leu Leu Glu Ala Ser Ala Leu Gly
340 345 350

Ser Gly Arg Ala Glu Ser Gly Cys Ala Arg Ala Gly Leu Val Ser Phe
355 360 365

Asn Phe Leu Val Ala Ala Cys Ala Ala Ala Tyr Asp Ala Arg Asp Ala
370 375 380

Ala Glu Ala Val Arg Ala His Ile Thr Thr Asn Tyr Gly Gly Thr Arg
385 390 395 400

Ala Gly Ala Arg Leu Asp Arg Phe Ser Glu Cys Leu Arg Ala Met Val
405 410 415

His Thr His Val Phe Phe Val Met Arg Phe Phe Gly Gly Leu Val Ser
420 425 430

Trp Val Thr Gln Asp Glu Leu Ala Ser Val Thr Ala Val Cys Ser Gly
435 440 445

Pro Gln Glu Ala Thr His Thr Gly His Pro Gly Arg Pro Cys Ser Ala
450 455 460

Val Thr Ile Pro Ala Cys Ala Phe Val Asp Leu Asp Ala Glu Leu Cys
465 470 475 480

Leu Gly Gly Pro Gly Ala Ala Phe Leu Tyr Leu Val Phe Tyr Gln Cys
485 490 495

Arg Asp Gln Glu Leu Cys Cys Val Tyr Val Val Lys Ser Gln Leu Pro
500 505 510

Pro Arg Gly Leu Glu Ala Ala Leu Glu Arg Leu Phe Gly Arg Leu Arg
515 520 525

Ile Thr Asn Thr Ile His Gly Ala Glu Asp Met Thr Pro Pro Pro Pro
530 535 540

Asn Arg Asn Val Asp Phe Pro Leu Ala Val Leu Ala Ala Ser Ser Gln
545 550 555 560

Ser Pro Arg Cys Ser Ala Ser Gln Val Thr Asn Pro Gln Phe Val Asp
565 570 575

Arg Leu Tyr Arg Trp Gln Pro Asp Leu Arg Gly Arg Pro Thr Ala Arg
580 585 590

Thr Cys Thr Tyr Ala Ala Phe Ala Glu Leu Gly Val Met Pro Asp Asn
595 600 605

Ser Pro Arg Cys Leu His Arg Thr Glu Arg Phe Gly Ala Val Gly Val
610 615 620

Pro Val Val Ile Gly Val Val Trp Arg Pro Gly Gly Trp Arg Ala Cys
625 630 635 640

Ala



(2) INFORMATION FOR SEQ ID NO: 123: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1160 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 



Val Ile Arg Arg Pro Val Arg Pro Phe Gly Arg Thr Ala His Pro Ala
1 5 10 15

Ser His Gly Pro Ala Ala Val Ser Val His Arg Val Arg Ala Thr Val
20 25 30

Thr Leu Val Pro Met Ala Asn Arg Pro Ala Ala Ser Ala Gly Ala Arg
35 40 45

Ser Pro Ser Gln Glu Pro Arg Glu Pro Glu Val Ala Pro Pro Gly Gly
50 55 60

Asp His Val Phe Cys Arg Lys Val Ser Gly Val Met Val Leu Ser Ser
65 70 75 80

Asp Pro Pro Gly Pro Ala Ala Tyr Arg Ile Ser Asp Ser Ser Phe Val
85 90 95

Gln Cys Gly Ser Asn Cys Ser Met Ile Ile Asp Gly Asp Val Arg His
100 105 110

Leu Arg Asp Leu Glu Gly Ala Thr Ser Thr Gly Ala Phe Val Ala Ile
115 120 125

Ser Asn Val Ala Ala Gly Gly Asp Gly Arg Thr Ala Val Val Gly Gly
130 135 140

Thr Ser Gly Pro Ser Ala Thr Thr Ser Val Gly Thr Gln Thr Ser Gly
145 150 155 160

Glu Phe Leu His Gly Asn Pro Arg Thr Pro Glu Pro Gln Gly Pro Gln
165 170 175

Ala Val Pro Pro Pro Pro Pro Pro Pro Phe Pro Trp Gly His Glu Cys
180 185 190

Cys Ala Arg Arg Asp Arg Gly Ala Glu Lys Asp Val Gly Ala Ala Glu
195 200 205

Ser Trp Ser Asp Gly Pro Ser Ser Asp Ser Glu Thr Glu Asp Ser Asp
210 215 220

Ser Ser Asp Glu Asp Thr Gly Ser Gly Ser Glu Thr Leu Ser Arg Ser
225 230 235 240

Ser Ser Ile Trp Ala Ala Gly Ala Thr Asp Asp Asp Asp Ser Asp Ser
245 250 255

Asp Ser Arg Ser Asp Asp Ser Val Gln Pro Asp Val Val Val Arg Arg
260 265 270

Arg Trp Ser Asp Gly Pro Ala Pro Val Ala Phe Pro Lys Pro Arg Arg
275 280 285

Pro Gly Asp Ser Pro Gly Asn Pro Gly Leu Gly Ala Gly Thr Gly Pro
290 295 300

Gly Ser Ala Thr Asp Pro Arg Ala Ser Ala Asp Ser Asp Ser Ala Ala
305 310 315 320

His Ala Ala Ala Pro Gln Ala Asp Val Ala Pro Val Leu Asp Ser Gln
325 330 335

Pro Thr Val Gly Thr Asp Pro Gly Tyr Pro Val Pro Leu Glu Leu Thr
340 345 350

Pro Glu Asn Ala Glu Ala Val Ala Arg Phe Leu Gly Asp Ala Val Asp
355 360 365

Arg Glu Pro Ala Leu Met Leu Glu Tyr Phe Cys Arg Cys Ala Arg Glu
370 375 380

Glu Ser Lys Arg Val Pro Pro Arg Thr Phe Gly Ser Ala Pro Arg Leu
385 390 395 400

Thr Glu Asp Asp Phe Gly Leu Leu Asn Tyr Ala Glu Met Arg Arg Leu
405 410 415

Cys Leu Asp Leu Pro Pro Val Pro Pro Asn Ala Tyr Thr Pro Tyr His
420 425 430

Leu Arg Glu Tyr Ala Thr Arg Leu Val Asn Gly Phe Lys Pro Leu Val
435 440 445

Arg Arg Ser Ala Arg Leu Tyr Arg Ile Leu Gly Ile Leu Val His Leu
450 455 460

Arg Ile Arg Thr Arg Glu Ala Ser Phe Glu Glu Trp Met Arg Ser Lys
465 470 475 480

Glu Val Asp Leu Asp Phe Gly Leu Thr Glu Arg Leu Arg Glu His Glu
485 490 495

Ala Gln Leu Met Ile Leu Ala Gln Ala Leu Asn Pro Tyr Asp Cys Leu
500 505 510

Ile His Ser Thr Pro Asn Thr Leu Val Glu Arg Gly Leu Gln Ser Ala
515 520 525

Leu Lys Tyr Glu Glu Phe Tyr Leu Lys Arg Phe Gly Gly His Tyr Met
530 535 540

Glu Ser Val Phe Gln Met Tyr Thr Arg Ile Ala Gly Phe Leu Ala Cys
545 550 555 560

Arg Ala Thr Arg Gly Met Arg His Ile Ala Leu Gly Arg Gln Gly Ser
565 570 575

Trp Trp Glu Met Phe Lys Phe Phe Phe His Arg Leu Tyr Asp His Gln
580 585 590

Ile Val Pro Ser Thr Pro Ala Met Leu Asn Leu Gly Thr Arg Asn Tyr
595 600 605

Tyr Thr Ser Ser Cys Tyr Leu Val Asn Pro Gln Ala Thr Thr Asn Gln
610 615 620

Ala Thr Leu Arg Ala Ile Thr Gly Asn Val Ser Ala Ile Leu Ala Arg
625 630 635 640

Asn Gly Gly Ile Gly Leu Cys Met Gln Ala Phe Asn Asp Asp Gly Thr
645 650 655

Ala Ser Ile Met Pro Ala Leu Lys Val Leu Asp Ser Leu Val Ala Ala
660 665 670

His Asn Lys Gln Ser Trp Thr Gly Ala Cys Val Tyr Leu Glu Pro Trp
675 680 685

His Ser Asp Val Arg Ala Val Leu Arg Met Lys Gly Val Leu Ala Gly
690 695 700

Glu Glu Ala Gln Arg Cys Asp Asn Ile Phe Ser Ala Leu Trp Met Pro
705 710 715 720

Asp Leu Phe Phe Lys Arg Leu Ile Arg His Leu Asp Gly Glu Lys Asn
725 730 735

Val Thr Trp Ser Leu Phe Asp Arg Asp Thr Ser Met Ser Leu Ala Asp
740 745 750

Phe His Gly Glu Glu Phe Glu Lys Leu Tyr Glu His Leu Glu Ala Met
755 760 765

Gly Phe Gly Glu Thr Ile Pro Ile Gln Asp Leu Ala Tyr Ala Ile Val
770 775 780

Arg Ser Ala Ala Thr Thr Gly Ser Pro Phe Ile Met Phe Lys Asp Ala
785 790 795 800

Val Asn Arg His Tyr Ile Tyr Asp Thr Gln Gly Ala Ala Ile Ala Gly
805 810 815

Ser Asn Leu Cys Thr Glu Ile Val His Pro Ser Ser Lys Arg Ser Ser
820 825 830

Gly Val Cys Asn Leu Gly Ser Val Asn Leu Ala Arg Cys Val Ser Arg
835 840 845

Arg Thr Phe Asp Phe Gly Met Leu Arg Asp Ala Val Gln Ala Cys Val
850 855 860

Leu Met Val Asn Ile Met Ile Asp Ser Thr Leu Gln Pro Thr Pro Gln
865 870 875 880

Cys Arg His Asp Asn Leu Arg Ser Met Gly Ile Gly Met Gln Gly Leu
885 890 895

His Thr Ala Cys Leu Lys Met Gly Leu Asp Leu Glu Ser Ala Glu Phe
900 905 910

Arg Asp Leu Asn Thr His Ile Ala Glu Val Met Leu Leu Ala Ala Met
915 920 925

Lys Thr Ser Asn Ala Leu Cys Val Arg Gly Ala Arg Pro Phe Ser His
930 935 940

Phe Lys Arg Ser Met Tyr Arg Ala Gly Arg Phe His Trp Glu Arg Phe
945 950 955 960

Ser Asn Asp Arg Tyr Glu Gly Glu Trp Glu Met Leu Arg Gln Ser Met
965 970 975

Met Lys His Gly Leu Arg Asn Ser Gln Phe Ile Ala Leu Met Pro Thr
980 985 990

Ala Ala Ser Ala Gln Ile Ser Asp Val Ser Glu Gly Phe Ala Pro Leu
995 1000 1005

Phe Thr Asn Leu Phe Ser Lys Val Thr Arg Asp Gly Glu Thr Leu Arg
1010 1015 1020

Pro Asn Thr Leu Leu Leu Lys Glu Leu Glu Arg Thr Phe Gly Gly Lys
1025 1030 1035 1040

Arg Leu Leu Asp Ala Met Asp Gly Leu Glu Ala Lys Gln Trp Ser Val
1045 1050 1055

Ala Gln Ala Leu Pro Cys Leu Asp Pro Ala His Pro Leu Arg Arg Phe
1060 1065 1070

Lys Thr Ala Phe Asp Tyr Asp Gln Glu Leu Leu Ile Asp Leu Cys Ala
1075 1080 1085

Asp Arg Ala Pro Tyr Val Asp His Ser Gln Ser Met Thr Leu Tyr Val
1090 1095 1100

Thr Glu Lys Ala Asp Gly Thr Leu Pro Ala Ser Thr Leu Val Arg Leu
1105 1110 1115 1120

Leu Val His Ala Tyr Lys Arg Gly Leu Lys Thr Gly Met Tyr Tyr Cys
1125 1130 1135

Lys Val Arg Lys Ala Thr Asn Ser Gly Val Phe Ala Gly Asp Asp Asn
1140 1145 1150

Ile Val Cys Thr Ser Cys Ala Leu
1155 1160

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Met Asp Pro Ala Val Ser Pro Ala Ser Thr Asp Pro Leu Asp Thr His
1 5 10 15

Ala Ser Gly Ala Gly Ala Ala Pro Ile Pro Val Cys Pro Thr Pro Glu
20 25 30

Arg Tyr Phe Tyr Thr Ser Gln Cys Pro Asp Ile Asn His Leu Arg Ser
35 40 45

Leu Ser Ile Leu Asn Arg Trp Leu Glu Thr Glu Leu Val Phe Val Gly
50 55 60

Asp Glu Glu Asp Val Ser Lys Leu Ser Glu Gly Glu Leu Gly Phe Tyr
65 70 75 80

Arg Phe Leu Phe Ala Phe Leu Ser Ala Ala Asp Asp Leu Val Thr Glu
85 90 95

Asn Leu Gly Gly Leu Ser Gly Leu Phe Glu Gln Lys Asp Ile Leu His
100 105 110

Tyr Tyr Val Glu Gln Glu Cys Ile Glu Val Val His Ser Arg Val Tyr
115 120 125

Asn Ile Ile Gln Leu Val Leu Phe His Asn Asn Asp Gln Ala Arg Arg
130 135 140

Ala Tyr Val Ala Arg Thr Ile Asn His Pro Ala Ile Arg Val Lys Val
145 150 155 160

Asp Trp Leu Glu Ala Arg Val Arg Glu Cys Asp Ser Ile Pro Glu Lys
165 170 175

Phe Ile Leu Met Ile Leu Ile Glu Gly Val Phe Phe Ala Ala Ser Phe
180 185 190

Ala Ala Ile Ala Tyr Leu Arg Thr Asn Asn Leu Leu Arg Val Thr Cys
195 200 205

Gln Ser Asn Asp Leu Ile Ser Arg Asp Glu Ala Val His Thr Thr Ala
210 215 220

Ser Cys Tyr Ile Tyr Asn Asn Tyr Leu Gly Gly His Ala Lys Pro Glu
225 230 235 240

Ala Ala Arg Val Tyr Arg Leu Phe Arg Glu Ala Val Asp Ile Glu Ile
245 250 255

Gly Phe Ile Arg Ser Gln Ala Pro Thr Asp Ser Ser Ile Leu Ser Pro
260 265 270

Gly Ala Ala Ile Glu Asn Tyr Val Arg Phe Ser Ala Asp Arg Leu Leu
275 280 285

Gly Leu Ile His Met Gln Pro Lys Ala Pro Ala Pro Asp Ala Ser Phe
290 295 300

Pro Leu Ser Leu Met Ser Thr Asp Lys His Thr Asn Phe Phe Glu Cys
305 310 315 320

Arg Ser Thr Ser Tyr Ala Gly Ala Val Val Asn Asp Leu
325 330

(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 357 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Met Arg Arg Arg Gly His Ala Phe Ala Pro Gly Asp Arg Gly Thr Arg
1 5 10 15

Ala Ala Gly Pro Gly Pro Ala Ala Pro Trp Gly Ala Pro Ser Lys Pro
20 25 30

Ala Leu Arg Leu Ala His Leu Phe Cys Ile Arg Val Leu Arg Ala Leu
35 40 45

Gly Tyr Ala Tyr Ile Asn Ser Gly Gln Leu Glu Ala Asp Asp Ala Cys
50 55 60

Ala Asn Leu Tyr His Thr Asn Thr Val Ala Tyr Val His Thr Thr Asp
65 70 75 80

Thr Asp Leu Leu Leu Met Gly Cys Asp Ile Val Leu Asp Ile Ser Thr
85 90 95

Gly Tyr Ile Pro Thr Ile His Cys Arg Asp Leu Leu Gln Tyr Phe Lys
100 105 110

Met Ser Tyr Pro Gln Phe Leu Ala Leu Phe Val Arg Cys His Thr Asp
115 120 125

Leu His Pro Asn Asn Thr Tyr Ala Ser Val Glu Asp Val Leu Arg Glu
130 135 140

Cys His Trp Thr Ala Pro Ser Arg Ser Gln Ala Arg Arg Ala Ala Arg
145 150 155 160

Arg Glu Arg Ala Asn Ser Arg Ser Leu Glu Ser Met Pro Thr Leu Thr
165 170 175

Ala Ala Pro Val Gly Leu Glu Thr Arg Ile Ser Trp Thr Glu Ile Leu
180 185 190

Ala Gln Gln Ile Ala Gly Glu Asp Asp Tyr Glu Glu Asp Pro Pro Leu
195 200 205

Gln Pro Pro Asp Val Ala Gly Gly Pro Arg Asp Gly Ala Arg Ser Ser
210 215 220

Ser Ser Glu Ile Leu Thr Pro Pro Glu Leu Val Gln Val Pro Asn Ala
225 230 235 240

Gln Arg Val Ala Glu His Arg Gly Tyr Val Ala Gly Arg Arg Arg His
245 250 255

Val Ile His Asp Ala Pro Glu Ala Leu Asp Trp Leu Pro Asp Pro Met
260 265 270

Thr Ile Ala Glu Leu Val Glu His Arg Tyr Val Lys Tyr Val Ile Ser
275 280 285

Leu Ile Ser Pro Lys Glu Arg Gly Pro Trp Thr Leu Leu Lys Arg Leu
290 295 300

Pro Ile Tyr Gln Asp Leu Arg Asp Glu Asp Leu Ala Arg Ser Ile Val
305 310 315 320

Thr Arg His Ile Thr Ala Pro Asp Ile Ala Asp Arg Phe Leu Ala Gln
325 330 335

Leu Trp Ala His Ala Pro Pro Pro Ala Phe Tyr Lys Asp Val Leu Ala
340 345 350

Lys Phe Trp Asp Glu
355



(2) INFORMATION FOR SEQ ID NO: 126: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 466 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Met Ala His Leu Pro Gly Gly Ala Ala Ala Ala Pro Leu Ser Glu Asp
1 5 10 15

Ala Ile Pro Ser Pro Arg Glu Arg Thr Glu Asp Trp Pro Pro Cys Gln
20 25 30

Ile Val Leu Gln Gly Ala Glu Leu Asn Gly Ile Leu Gln Ala Phe Ala
35 40 45

Pro Leu Arg Thr Ser Leu Leu Asp Ser Leu Leu Val Val Gly Asp Arg
50 55 60

Gly Ile Leu Val His Asn Ala Ile Phe Gly Glu Gln Val Phe Leu Pro
65 70 75 80

Leu Asp His Ser Gln Phe Ser Arg Tyr Arg Trp Gly Gly Pro Thr Ala
85 90 95

Ala Phe Leu Ser Leu Val Asp Gln Lys Arg Ser Leu Leu Ser Val Phe
100 105 110

Arg Ala Asn Gln Tyr Pro Asp Leu Arg Arg Val Glu Leu Thr Val Thr
115 120 125

Gly Gln Ala Pro Phe Arg Thr Leu Val Gln Arg Ile Trp Thr Thr Ala
130 135 140

Ser Asp Gly Glu Ala Val Glu Leu Ala Ser Glu Thr Leu Met Lys Arg
145 150 155 160

Glu Leu Thr Ser Phe Ala Val Leu Leu Pro Gln Gly Asp Pro Asp Val
165 170 175

Gln Leu Arg Leu Thr Lys Pro Gln Leu Thr Lys Val Val Asn Ala Val
180 185 190

Gly Asp Glu Thr Ala Lys Pro Thr Thr Phe Glu Leu Gly Pro Asn Gly
195 200 205

Lys Phe Ser Val Phe Asn Ala Arg Thr Cys Val Thr Phe Ala Ala Arg
210 215 220

Glu Glu Gly Ala Ser Ser Ser Thr Ser Ala Gln Val Gln Ile Leu Thr
225 230 235 240

Ser Ala Leu Lys Lys Ala Gly Gln Ala Ala Ala Asn Ala Lys Thr Val
245 250 255

Tyr Gly Glu Asn Thr Thr Phe Ser Val Val Val Asp Asp Cys Ser Met
260 265 270

Arg Ala Val Leu Arg Arg Leu Gln Val Gly Gly Gly Thr Leu Lys Phe
275 280 285

Phe Leu Thr Ala Asp Val Pro Ser Val Cys Val Thr Ala Thr Gly Pro
290 295 300

Asn Ala Val Ser Ala Val Phe Leu Leu Lys Pro Gln Arg Val Cys Leu
305 310 315 320

Asn Trp Leu Gly Arg Thr Pro Gly Ser Ser Thr Gly Ser Leu Ala Ser 

325 - 330 335 

Gin Asp Ser Arg Ala Gly Pro Thr Asp Ser Gin Asp Phe Ser Ser Glu 
5 340 345 350 

Pro Asp Ala Gly Asp Arg Gly Ala Pro Glu Glu Glu Gly Leu Glu Gly 

355 360 365 

Gin Ala Arg Val Pro Pro Ala Phe Pro Glu Pro Pro Gly Thr Lys Arg 
370 375 380 

10 Arg His Ala Gly Ala Glu Val Val Pro Ala Asp Asp Ala Thr Lys Arg 
385 390 395 400 

Pro Lys Thr Gly Val Pro Ala Ala Pro Thr Arg Ala Glu Ser Pro Pro 

405 410 415 

Leu Ser Ala Arg Tyr Gly Pro Glu Ala Ala Glu Gly Gly Gly Asp Gly 
15 420 425 430 

Gly Arg Tyr Ala Cys Tyr Phe Arg Asp Leu Gin Thr Gly Asp Asp Ser 

435 440 445 

Pro Leu Ser Ala Phe Arg Gly Pro Gin Arg Pro Pro Tyr Gly Phe Gly 
450 455 460 

20 Leu Pro 
465 

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 331 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Val Cys Pro Pro Pro Pro Thr Asn Met Ala Val Val Cys Gly Ser Gly
1 5 10 15

Leu Arg Leu Arg Pro Phe His Pro Pro Ser Pro Ser Phe Phe Val Leu
20 25 30

Arg Ala Leu Ile Arg Ala Gly Pro Gly Pro Phe Ala Asp Arg Ala Pro
35 40 45

Ser Gly Pro Gly Cys Gly Met Cys Arg Gly Asp Ser Pro Gly Val Ala
50 55 60

Gly Gly Ser Gly Glu His Cys Leu Gly Gly Asp Asp Gly Asp Asp Gly
65 70 75 80

Arg Pro Arg Leu Ala Cys Val Gly Ala Ile Arg Phe Ala His Leu Trp
85 90 95

Leu Gln Ala Thr Thr Leu Gly Phe Val Gly Ser Val Val Leu Ser Arg
100 105 110

Gly Pro Tyr Ala Asp Ala Met Ser Gly Ala Phe Val Ile Gly Ser Thr
115 120 125

Gly Leu Gly Phe Leu Arg Ala Pro Pro Ala Phe Ala Arg Pro Pro Thr
130 135 140

Arg Val Cys Ala Trp Leu Arg Leu Val Gly Gly Gly Ala Ala Val Trp
145 150 155 160

Ser Leu Gly Glu Ala Gly Ala Pro Pro Gly Val Pro Gly Pro Ala Thr
165 170 175

Gln Cys Leu Ala Leu Gly Ala Ala Tyr Ala Ala Leu Leu Val Leu Ala
180 185 190

Asp Asp Val His Pro Leu Phe Leu Leu Ala Pro Arg Pro Leu Phe Val
195 200 205

Gly Thr Leu Gly Val Val Val Gly Gly Leu Thr Ile Gly Gly Ser Ala
210 215 220

Arg Tyr Trp Trp Ile Asp Pro Arg Ala Ala Ala Ala Leu Thr Ala Ala
225 230 235 240

Val Val Ala Gly Leu Gly Thr Thr Ala Ala Gly Asp Ser Phe Ser Lys
245 250 255

Ala Cys Pro Arg His Arg Arg Phe Cys Val Val Ser Ala Val Glu Ser
260 265 270

Pro Pro Pro Arg Tyr Ala Pro Glu Asp Ala Glu Arg Pro Thr Asp His
275 280 285

Gly Pro Leu Leu Pro Ser Thr His His Gln Arg Ser Pro Arg Val Cys
290 295 300

Gly Asp Gly Ala Ala Arg Pro Glu Asn Ile Trp Val Pro Val Val Thr
305 310 315 320

Phe Ala Gly Ala Leu Ala Ala Cys Ala Arg Ser
325 330

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2342 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

GGCGGGATCT GCGCACGCGC GGCACGGCGG CGGAGAAAGC GGCGGCAGAG CCGGAAAAGG 60
CCGGGGGAGG AAGCGCGGCA TCCGCGGGGG GACTCGGTGT GGGTGGCGAG GGCCGTGGGT 120
CGTCGCGAGG GGCCACGGGC ACGCGCCCCG TGTTTTGTTG AGGCGGGACA CTCGGTCGTG 180
TTTCGCGAGC CGTAGCTGCC GGCCCGATGG GCCGCGGTGC GTACTGGGAC GTGGGGACGG 240
ACTGATCGGT GGCGGGGGGG GGAAGAAGGG CCGGGGCCGG ATTGGGCGTG GGGCCGCCGG 300
CGTCGTCGGA CGCCAGCTCC TCCAGGCCGT GGATCCAGGC CCACATGCGA GGGGGGACGG 360
GCTCGCCGGT GGTGGCGTCG GTGAGGAGAG TGGGGGCGAG GACCCCCGGG TCCGCCTGCC 420
GCGCGGGGGG GGCAGCGGGG TCCTCGGGAC CCGATCCGCC ATCCCCCCCC GCAAGGTCCC 480
GCGGGTCGCG GGCGGCGGTC GGGGCAGAGG GACCTGCCTC GTCGGCGAGG GGGCGCTGGT 540
AAACCGGGTG TCCCGGGAAC AGCTCCCCCG TCAGGAGGGA GGCGTCGAAG GGCCGCCCGA 600
GGATGGCCCG CGCGAAGAAG GGGTCCGCGT CGGCGGCGCT CGCCGCGAGA ACGTCCCCCG 660
CGGTAGCCAC AAACGGAAGC TCCTCGGTGG CCTCGCTGCC CACAAACCGC ACGTCAGGGG 720
GGCCGGGGGG CTCCGGGGCT TCCCACAAGA CCGCGACCGG GGTCATGGAG ATGTCCACGA 780
GGACCAGGCA CGGGGGCCCG TCGGCGAGAG GGCGCTCGGC GATGAGCGCC GACAGGCGCG 840
GGAGCTGCGC CGCCAGACAC GCGTTTTCGA TCGGGTTGAG ATCGGTGTGG AGGAGGCCGA 900
CGGCCCACGT CTCGATGTCG GACGACACGA CGTCGCGCAG GGCGGCGTCC GGCCCGCCGG 960
GGCGCGAGTC GAAGAGCGTC AGGCACAGTT CCAGTTCCGA CTCGCGGGAG AAGGCCGTGG 1020
TGTTGCGGAG CGCCACCACG ACGGGCGCGC CGAGGAGCAC CGCGGCCAGA ACCAGGTCCA 1080
TGGCCGTAAC GCGCGCGGCG GGGGTGCGGT GGGTCGCGGC GGCCAGCACG GCCACGTGCT 1140
GGCCCGTGGG TCGGTAGAGG GCGTGGGGGG CCTCGGGGAG GGACGCCTCG CGCCCCCCCG 1200
CCGGGCCGAG CGTCTGGCCA GACTCCAGGC GTGCGGCCAG GAGGGCGTCG AAGCTGTCGT 1260
ACTCGGTGTA GTCGTCGGGA AACATGCAGG TCCACAGCGC GGCCAAAGCG GCGCTCGGCA 1320
GACACATGCG CCCGAGGACG CTCACCGCCG CCAGGGCCTG GGCCGGACTG AGCTTCCCGA 1380
GCGCCGGGGC GTCCCGGCGC TGGGTCCCGA GCTCCAAGGC CGAGCGCCAG GGCGCCAGCG 1440
GGTCGGTTTC GGACAGCTTG CCCCGGCGCC AGTCGGCCAG CCGCGTGCCG AACAGGAGGC 1500
CCCGGGTCGG GGGGCCTCCG TCCAAAAACG TCGGCAACAC GCGGATGCGG GCGTCGGGAT 1560
GCGGGGTCAG GCGCTGGACG AACAGCATGG ACTCCGCTGC GTCCTCGAAC GCGCGTTCGA 1620
GGGTGAGGTG CATGTACTCG TGCTGGCGAA CGAGGTCCAG GCGCCAGAAG TTGTAGATGT 1680
GTTCCGGAAC GCCGGCCACC AGCGCGACCA GCACGTCGTT CTCGTTGAAG GCGACGCAGT 1740
GGCGCTGGGA CCCCCGGGGG CCCGGCGGCG GACGCGGCGC CGCCGCTCCG GACGCCCAGC 1800
CCAGCTGGGC CCAGCGACAC CCAAACTCGC GCGTGAGGGT GGTGGCGACG AGGGCGACGT 1860
ACAGCTCGGC CGCCGCGTCC ATCGAGGCGC CCCACGTCGC CTGGCGATGG CGCACGAAGC 1920
GACCGAACAG CTGAAAGTTG GCGGCCTGGG CGTCGCTGAG GGCCAGCTGG AGCCGGTTCA 1980
CGACGGTCAG CACGTACATG GCCGTGACCG TCGGGGCCGA TTCGAGGACG TCCGTCGGAA 2040
GCGGGGGCCG CACGCAGGCC GCCTCGGGAC GCATCAGCAG CGCGCCGAGT TTGTCGGTGA 2100
CGGCCGGGAA GCATAGCGCG TACTGCAGCG GCGTTCCGTC CGGGGCCAAA AAGCTGGTGG 2160
CGAACGGCAG ATCCAGAGCG CTGACGGCCT CACGCAGCAC CAGGGGCCCC GGGTCTCCGC 2220
CGGCGCGCAG ATACGCCTCG CCCCGGCGGC GCAGCAGCTG CGGGTCGACC TCGTGGCCCT 2280
CGGGGGAAGA AGAGGCCCGG GCGCGGGCGT CGAGGGCGCG AAGATCAACG AGCAGGGGCG 2340
C 2342

(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 771 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

Ala Pro Leu Leu Val Asp Leu Arg Ala Leu Asp Ala Arg Ala Arg Ala
1 5 10 15

Ser Ser Ser Pro Glu Gly His Glu Val Asp Pro Gln Leu Leu Arg Arg
20 25 30

Arg Gly Glu Ala Tyr Leu Arg Ala Gly Gly Asp Pro Gly Pro Leu Val
35 40 45

Leu Arg Glu Ala Val Ser Ala Leu Asp Leu Pro Phe Ala Thr Ser Phe
50 55 60

Leu Ala Pro Asp Gly Thr Pro Leu Gln Tyr Ala Leu Cys Phe Pro Ala
65 70 75 80

Val Thr Asp Lys Leu Gly Ala Leu Leu Met Arg Pro Glu Ala Ala Cys
85 90 95

Val Arg Pro Pro Leu Pro Thr Asp Val Leu Glu Ser Ala Pro Thr Val
100 105 110

Thr Ala Met Tyr Val Leu Thr Val Val Asn Arg Leu Gln Leu Ala Leu
115 120 125

Ser Asp Ala Gln Ala Ala Asn Phe Gln Leu Phe Gly Arg Phe Val Arg
130 135 140

His Arg Gln Ala Thr Trp Gly Ala Ser Met Asp Ala Ala Ala Glu Leu
145 150 155 160

Tyr Val Val Ala Thr Thr Leu Thr Arg Glu Phe Gly Cys Arg Trp Ala
165 170 175

Gln Leu Gly Trp Ala Ser Gly Ala Ala Ala Pro Arg Pro Pro Pro Gly
180 185 190

Pro Arg Gly Ser Gln Arg His Cys Val Ala Phe Asn Glu Asn Asp Val
195 200 205

Leu Val Val Ala Gly Val Pro Glu His Ile Tyr Asn Phe Trp Arg Leu
210 215 220

Asp Leu Val Arg Gln His Glu Tyr Met His Leu Thr Leu Glu Arg Ala
225 230 235 240

Phe Glu Asp Ala Ala Glu Ser Met Leu Phe Val Gln Arg Leu Thr Pro
245 250 255

His Pro Asp Ala Arg Ile Arg Val Leu Pro Thr Phe Leu Asp Gly Gly
260 265 270

Pro Pro Thr Arg Gly Leu Leu Phe Gly Thr Arg Leu Ala Asp Trp Arg
275 280 285

Arg Gly Lys Leu Ser Glu Thr Asp Pro Leu Ala Pro Trp Arg Ser Ala
290 295 300

Leu Glu Leu Gly Thr Gln Arg Arg Asp Ala Pro Ala Leu Gly Lys Leu
305 310 315 320

Ser Pro Ala Gln Ala Ala Val Ser Val Leu Gly Arg Met Cys Leu Pro
325 330 335

Ser Ala Ala Ala Leu Trp Thr Cys Met Phe Pro Asp Asp Tyr Thr Glu
340 345 350

Tyr Asp Ser Phe Asp Ala Leu Leu Ala Ala Arg Leu Glu Ser Gly Gln
355 360 365

Thr Leu Gly Pro Ala Gly Gly Arg Glu Ala Ser Leu Pro Glu Ala Pro
370 375 380

His Ala Leu Tyr Arg Pro Thr Gly Gln His Val Ala Val Leu Ala Ala
385 390 395 400

Ala Thr Thr Pro Ala Ala Arg Val Thr Ala Met Asp Leu Val Leu Ala
405 410 415

Ala Val Leu Leu Gly Ala Pro Val Val Val Arg Asn Thr Thr Ala Phe
420 425 430

Ser Arg Glu Ser Glu Leu Glu Leu Cys Leu Thr Leu Phe Asp Ser Arg
435 440 445

Pro Gly Gly Pro Asp Ala Ala Leu Arg Asp Val Val Ser Ser Asp Ile
450 455 460

Glu Thr Trp Ala Val Gly Leu Leu His Thr Asp Leu Asn Pro Ile Glu
465 470 475 480

Asn Ala Cys Leu Ala Ala Gln Leu Pro Arg Leu Ser Ala Leu Ile Ala
485 490 495

Glu Arg Pro Leu Ala Asp Gly Pro Pro Cys Leu Val Leu Val Asp Ile
500 505 510

Ser Met Thr Pro Val Ala Val Leu Trp Glu Ala Pro Glu Pro Pro Gly
515 520 525

Pro Pro Asp Val Arg Phe Val Gly Ser Glu Ala Thr Glu Glu Leu Pro
530 535 540

Phe Val Ala Thr Ala Gly Asp Val Leu Ala Ala Ser Ala Ala Asp Ala
545 550 555 560

Asp Pro Phe Phe Ala Arg Ala Ile Leu Gly Arg Pro Phe Asp Ala Ser
565 570 575

Leu Leu Thr Gly Glu Leu Phe Pro Gly His Pro Val Tyr Gln Arg Pro
580 585 590

Leu Ala Asp Glu Ala Gly Pro Ser Ala Pro Thr Ala Ala Arg Asp Pro
595 600 605

Arg Asp Leu Ala Gly Gly Asp Gly Gly Ser Gly Pro Glu Asp Pro Ala
610 615 620

Ala Pro Pro Ala Arg Gln Ala Asp Pro Gly Val Leu Ala Pro Thr Leu
625 630 635 640

Leu Thr Asp Ala Thr Thr Gly Glu Pro Val Pro Pro Arg Met Trp Ala
645 650 655

Trp Ile His Gly Leu Glu Glu Leu Ala Ser Asp Asp Ala Gly Gly Pro
660 665 670

Thr Pro Asn Pro Ala Pro Ala Leu Leu Pro Pro Pro Ala Thr Asp Gln
675 680 685

Ser Val Pro Thr Ser Gln Tyr Ala Pro Arg Pro Ile Gly Pro Ala Ala
690 695 700

Thr Ala Arg Glu Trp Ser Val Pro Pro Gln Gln Asn Thr Gly Arg Val
705 710 715 720

Pro Val Ala Pro Arg Asp Asp Pro Arg Pro Ser Pro Pro Thr Pro Ser
725 730 735

Pro Pro Ala Asp Ala Ala Leu Pro Pro Pro Ala Phe Ser Gly Ser Ala
740 745 750

Ala Ala Phe Ser Ala Ala Val Pro Arg Val Arg Arg Ser Arg Xaa Xaa
755 760 765

Xaa Xaa Xaa
770

(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14927 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

GCACGTCAGC GGCGGCCCGC GTCGCGATGT CGCCCCAGCT CTCCGGCCCC TGCGCCCCTG 60
GCTCGGGGCC GCGCTCCCCG TCCTCGCTCG CGGGCGTCCC CGCGCCACGC CTCCGCCCCC 120
CCTCCTCCGC GGCGGCCCGG GGCTCTTCCT CCTCGGCCCC CCCGGTCGCG CCGCCGGCCC 180
CCAGCCGCGC CAGCACGCGG CGCAGCGCCT CCTCGTCGCA CTGCTCGGGG CTGACGAGCC 240
GCCGCAGCAG CGGCGTCGTC AGGTGGTGGT CGTAGCACGC GCGTATCAGC GCCTCGATCT 300
GATCGTCGGG CGACGTCGCC TGGCCGCCGA TGATCAGGGC GTCCACCATG TCCAGCGCCG 360
CCAGGTGGCC CCCGAACGCG CGATCGAAGT GCTCCGCCCG CCGCCCGAAC AGCGCCAGCT 420
CCACGGCCAC CGCGGCGGTC TCCTGCTGCA GCTCGCGCTG CGCCAGCGCG TTCAGGTTGT 480
CGGCGAAGGC GTCCATGGTG GAGTGGCGGG CGCGATCGCC GGACGCCAGC CAGAAGCGCA 540
GCTCGCTGAT GGCGTACAGG CCGGGCGTAG TGGCCTGAAA CACGTCATGC GCCTCCAGCA 600
GGGCGTCGGC CTCCTCGCGG ACAGAAGAGC TATCGGCGGG CGGCGGGCCG GCCCGGGCCC 660
CGCCGCCCGC CGCGGTCCGC GCCAGCGCCT GGTCCAGCAC ACAGAGCGCT CGCGCGCGGG 720
CGGCGTCCGA CAGCCCGGCG GCGTGGGGCA GGTACCGTCG CAGCTCGTTG GCGTCCAGCC 780
GCACCTGGGC CTGTTGGGTG ACGTGGTTAC AGATGCGGTC CGCCAGGCGG CGGGCGATGG 840
TCGCCCCCTG GTTCGCGGTG ACGCACAGCT CCTCGAAACA GACCGCGCAC GGGTGGGACG 900
GGTCGCTCAG CTCCGGGGGC ACGATGAGGC CCGACCCCAC CGCCGCCACC ATAAACTCCC 960
GGACGCGCTC CAGCGCGGCC GTGGCGCCGC TCGGGGGGGT GATGAGGTGG CAGTAGTTCA 1020
GCTGCTTGAG AAAATTCTCG ACATCATGCA GGAAGCACAG CTCCATGCGG ACGTCCCCGC 1080
CGTACGTCTC CAGCCGGATC TGCTGGTGGT ACGGACAGGG TCGGGCCAGA CCCATGGTCT 1140
CGGTGAAAAA GGCAGAGACG TCACCCGTGG TCGCGAACGT TTCCAGGTGG CCCAGGAGCC 1200
GCTCCCCCTC GCGCCACGCG TACTCCAGGA GCAACTCCAG GGTGACCGAC AGCGGGGTGA 1260
GAAAGGCGGC GGCCTGAGCC TCCAGCCCCG GCCGCAGGTG CCGCCGCAGC ACGCGCACCT 1320
GGAGCGCGTT GAGCTTTAGC TGGGCGAGCT TCCCCAGGCC GATCTGGGGG TCGCATCGTC 1380
GAAGCAGCTC TAGCTGAAAA ACGTACGTCT GTACCTGCCC GAGCAGGGCC AACAGTTTCT 1440
GTCGGGCCGC AGTGGGCTCG GAAACCGCGG CCGGGGGCGC GGCCGCCATG GCGAGTCGCC 1500
CGGCCGTGCT GTGGTTTAGT TAAGGTTTGG GGGGGTGGGT CAGAGGCGCG CCCCGCGCGG 1560
ACTGATGCGG CGGCGGGCCC CTGACATCCC CTCTTTATGC CCGTCGCCCG CCCGCCCGCC 1620
CCGCCGGTGT GCCGTGATTC GCGGAGTCGG GGCCTTGTGT TTCTTTCTTT CCCCCCCCGA 1680
ATCCGTTCTT TCTTCCTCAC CCCCCCTCCC CACACACCCA CCCAGGACTC GCCACCACAA 1740
GGAGGCGAGA GCCCGTCGCT AACCCAAAGA CACAGTCACG AGACACGATA TCGACTGTAG 1800
TTGCGATCGT TTATTTTATA CACAACACCA ACCTTTCCTT CGACCCCCCC CACCCCCGCC 1860
CCTAGAGCAT ATCCAACGTC AGGTCCTTTT TCTCCGGTGG TCCCTCCCCA AACGGATCGT 1920
CGCCGTGAAA CGCCCGCTTT CGGGCGACGC CGGCCGCCCC CGCCGCCGCC GCCAAACCGC 1980
CGAACGACGC CGCGTGGTCA TCCTCGTCGC CGAAATCCCC AAAGTTAAAC ACCTCCCCGG 2040
CGGCGCCGAG CTGGCTGACC AGGGCCTCCG CCTCGTGGGC CACCTCCAGG GCCGCGTCGG 2100
TCGACCACTC GCCGTGCCCG CGCTCCAGGG CGCGGGTGGT AAACTCCATC ATTTCCTCGC 2160
TCAGGTACTC GTCCTCCAGC AGCGCCAGCC AGTCCTCGAT CTGCAGCTGC TGGGTGCGGG 2220
GGCCCAGGCT CTTGACGGTC GCCACAAACA CGCTGCTGGC GACCGCCGCC CCGCCCTCCG 2280
CAATGATGCC CCGGAGCTGC TCGCACAGCG AATGCTCGTG GGCCCCGCCC CCGAGACTCG 2340
ACGCCGCGCA CACAAACCCG GCCCTGGGGC AGGCCAGGAC AAACTTGCGG GTGCGGTCAA 2400
AGATCAGCAG CGGGCACGCG TTTTTGCCGC CCAGCAGGCT GGCCCAGTTC CCGGCCTGAA 2460
ACACGCGGTC GTTGCCGGCC ATGCCGTAGT ATTTGCTGAT GCTGAGGCCC AGCACGACCA 2520
TCGGGCGCGC GGCCATCACG GGCCGCAGCA GGTTGCAGCT CGCGAACATG GACGTCCAGG 2580
CGCCGGGGTG CGCGTCGAGG GAGTCCATCA GCGCGCGGGC CCCGGCCTCC AGGCCCGCGC 2640
CGCCCTGCGG GGCCCAGGCG GCGGCCGCCT GCACGCCGGG GGGACGGCGG GACCCGGCGA 2700
TGACGGCCGT GAGGGTGTTT ATGAAGTACG TCGAGTGGTC GCAGTACCTC AAGATCTGGT 2760
TGGCCATGTA GTACATGGCC AGTTCGCTCA CGTTATTGGG GGCCAGGTTG ATAAAGTTAA 2820
TCGCGCCGTA GTCCAGGGAG AACCTCTTAA TGAACGCGAT GGTCTCTATG TCCTCGCGCG 2880
ACAAGAGCCG GGCGGGGAGC TGGTTGCGCT GGAGGGCGGT CCAGAACCAC TGCGGGTTCG 2940
GCTGGTTCGA CCCCGGGGGC TTGCCGTTGG GAAAGATGAC CGCGTGGAAC TGCTTCAGCA 3000
GGAAGCCCAG CGGTCCGAGG AGGATGTCCA CGCGCTTGTC GGGCTTCTGG TAGGCGCTCT 3060
GGAGGCTGGC GACCCGCGCC TTGGCGGCCT CGGACGCGTT GGCGCTCGCG CCCGCGAACA 3120
ACACGCGGCT CTTGACGCGC AGCTCCTTGG GAAACCCCAG GGTCACGCGG GCAACGTCGC 3180
CCTCGAAGCT GCTCTCGGCG GGGGCCGTCT GGCCGGCCGT TAGGCTGGGG GCGCAGATAG 3240
CCGCCCCCTC CGAGAGCGCG ACCGTCAGCG TCTTCGCCGA CAGGAACCCG TTGTTGAACA 3300
GGTCCATGAC GCGCCGCCGC AGCACCGGTT GGAATTGATT GCGAAAGTTG CGCCCCTCGA 3360
CCGACTGCCC GGCGAACACC CCGTGGCACT GGCTCAGGGC CAGGTCCTGG TACACGGCGA 3420
GGTTGGACCG CCGCGCGAGG AGCTGCAGCA GGGGGCACGG CCCGCAGGTG TACGGGTCCA 3480
GCGACAGCGA CATGGCGTGG TTGGCCTCGG CCAGACCGTC GCGGAACTTA AAGTTGCGCC 3540
CCTCGATCAG GTTGCGCATC AGCTGTTCCA CCTCGCGATC CACCAGCTGC TTGATGTTGT 3600
TCACCACCGT GTGCAGGGCC TCGCGGGTGC CGATAATCGT CTCCAGCCTC CCCAGGGCCG 3660
TGGGCACCGC CTGGTCCACG TACTGCAGGG CCTCGAGCTC GGCCATGACG CGCTCGGTGG 3720
CCGCGCGGTA CGTCTCCTGC ATGATGGTCC GGGTGTTCTC GGACCCGTCC GCGCGCTTCA 3780
GGGCCGAGAA GGCGGCGTAG TTCCCCAGCA CGTCGCAGTC GCTGTACGCG CTGTTCATCG 3840
TTCCGAAGAC CCCAATGGCC CCCCGGGCGG CGCTCGCGAA CTTGGGGTGG CGGGCCCGCA 3900
GCCGCATCAG CGTCGTGTGC GCGCAGGCGT GGCGGGTCTC GAAGGTACAC AGGTTGCAGG 3960
GCACGTCGGT CTGGCCCGAG TCCGCGACGT AGCGAAACAC GTCCATCTCC TGGCGCCCGA 4020
CGATGACTCC GCCGTCGCAG CGCTCCAGGT AAAACAGCAT CTTGGCCAGC AGGGCCGGAG 4080
AGAACCCGCA CAGCATGGCC AGGTGCTCGC CGGCGAACTC CTGGGTTCCG CCGACGAGGG 4140
GCGCCGTGGG GCGCCCCTCG TACCCGGGCA CCACGTGGCC CTCGCGGTCC AGCTGCGGGT 4200
TGGCCGCCAC GTGCGTGCCG GGCACGAGAA AGAAGCGGTA AAAGGAGGGC TTGCTGTGGT 4260
CCTTGGGGTC CGCCGGCCCG GCGTCGTCCA CCTCGGTCAG GTGGAGGGCC GAGTTGGTGC 4320
TGAACACCAT GGCGCCCACG AGGCCCGCGG CGCGCGCCAG GTACGCCCCG ACGGCGCCGG 4380
CACGGGCCGC GGGCGTTTCC TGGCCCTCAA GCAGGGGCCA CGTGGTGATG TCGGGGGGCG 4440
GCTCGTCAAA GACCGCCATC GACACGATGG ACTCCAGGGC CAGGGCGGCG TCGCCCGCCA 4500
TCACCGAGGC CAGGCGCTGC TCAAACCCGC CCGCCGGGCC CTTGTTCCCG GCGTCGCGCG 4560
CGCCCCGCTG GGGCTTACCC TGGCTGGCCT CGAAGGCCGT GAACGTAATG TCGGCGGGGA 4620
GGGCCGCGCC CTCGTGGTTT TCGTCGAACG CCAGGTGGGC GGCCGCGCGG GCCACGGCGT 4680
CCACGTTCCG AGCACGCAGG GCCACGGCGG CGGGCCCGAC GACCGCCTCG AACAGCAGGC 4740
GGGCGAGGGG GCGGTTGAAA AACGGAAGGG GGTAGTTGAA ATTCTCCCCG ATCGATCGGT 4800
GGTTGCAGTT AAACGGATCG GCGATGACCC GGCTAAAATC CGGCATAAAC ATCTGCAGCG 4860
GATACACGGG GATGCGGTGA ACCTCCGCGT CCCCGATGGT TACCTTGTCC ATCCCGCCCA 4920
GGTGCAGGAA GGTGTTGCTG ATGCACACGG CCTCCCGGAA GCCCTCCGTG ATCACCAGAT 4980
ACAGCAAGGC CCGGTCCGGG TCCAGTCCGA GCCGCTCGCA CAGCGCGTCC CCCGTCGTCT 5040
CGTGCTTTAG GTCGCAGGGC CGGGGCGCGT AGTCCGAGAA GCCAAAATGG CGGCGCGCCC 5100
GCTCGCAGAG CCGCGTCAGG TTGGGGGCCT GGGTGCTGGG GGCCAGGTGG CGGCCGCCGT 5160
GAAAGACGTA GACGGACGGG CTGTAGTGCG AGGGCATAAG CTTGAGGGAC ACCGCGGTCC 5220
CCCCAAGGCC CGTGGTGCGG GACCCGACGA CCGCGGCCAC GTTGGCCTCA AACCCGCTCT 5280
CCACGGTCAG GCCGACGATG AGGGGCGCGA CGGCGACGTC CGCGTCGCCG CTGCGCGCCG 5340
ACAGTAGCGA CAGCAGCTCC AGGCCTTCGG CCGGACAGGC GCGGCCATAC ACGTACCCCA 5400
TCGGCCCCGG AGGAACCTTG ACGGTGGTCG TCGTTTTGGG CTTGGTGTCC ATGGCTTTCG 5460
GGAGATTGGC GACCGGCAGG AACGGGGGCC CGGCAAGACG ACCGGGGGCA GACGGGGGAG 5520
GCCGCGCGTG GTCGACGGCT GCTGCCCGCC GTCGTCTCTC CGATGGGGTC GAATGCCGGC 5580
GCTGGGGGTG GGGTCTACAC CCGCCCGTTC ACCGAGCGGC CCCTGGTGGG GGTGGGATGG 5640
GTGGGATGGG GTGGGCGAGA ATGGCCCGCC ACCGGATCGC GCCGGACGGG GGGGCCCGGG 5700
GTTGGGCAAG GTTTGGGCGC AAGGCTCCAG CGGCGATTCG AGAGGCCTGC GGATGGCGGC 5760
CCAGAGCTGG GTATGCTCGG CCGGGGCGGC CGGTATATGT ACGGCGTGCT GGGAGGGGCG 5820
GCGTCGGGCC CCGCCCACGG TCCGCCACGC CCCGCGCGTC ATCGGCAGGG GGCGTGGCCG 5880
CCCTTCTAAA AAAAGTGAGA ACGCGAAGCG TTCGCACTTT GTCCTAATAG TATATATATT 5940
ATTAGGACAA AGTGCGAACG CTTCGCGTTC TCACTTTTTT TAGAAGGGCG GCCACGCCCC 6000
CTTTGACGTC ACGCTCACCC GGGCGGCCGG CCGCCCATAA GCGCGGCCTG CCGGGCCGAT 6060
AAAAAGAAAC CGCGGCGCCC CCGCGGACAC CACACACTGG CTCTCGAACC CCGGACGCGC 6120
AGAAGGGACC CGGGCGCGGG TCCGCCGGTA AGAGCCGGGG GGAACATCGG CACCGCCATC 6180
CCACCCCGAG CTGTTGGGTG GGCGGGTGGG GGGGCTGGTG AGGCGGTGGT GGGAGGGGGC 6240
GGCGTATAGC AGGACAACGA CCGGCGGCGA TGTTTTGTGC CGCGGGCGGC CCGACTTCCC 6300
CCGGGGGGAA GTCGGCGGCT CGGGCGGCGT CTGGGTTTTT TGCCCCCCAC AACCCCCGGG 6360
GAGCCACCCA GACGGCACCG CCGCCTTGCC GCCGGCAGAA CTTCTACAAC CCCCACCTCG 6420
CTCAGACCGG AACGCAGCCA AAGGCCCCCG GGCCGGCTCA GCGCCATACG TACTACAGCG 6480
AGTGCGACGA ATTTCGATTT ATCGCCCCGC GTTCGCTGGA CGAGGACGCC CCCGCGGAGC 6540
AGCGCACCGG GGTCCACGAC GGCCGCCTCC GGCGCGCCCC TAAGGTGTAC TGCGGGGGGG 6600
ACGAGCGCGA CGTCCTCCGC GTGGGCCCGG AGGGCTTCTG GCCGCGTCGC TTGCGCCTGT 6660
GGGGCGGTGC GGACCATGCC CCCGAGGGGT TCGACCCCAC CGTCACCGTC TTCCACGTGT 6720
ACGACATCCT GGAGCACGTG GAACACGCGT ACAGCATGCG CGCCGCCCAG CTCCACGAGC 6780
GATTTATGGA CGCCATCACG CCCGCCGGGA CCGTCATCAC GCTTCTGGGT CTGACCCCCG 6840
AAGGCCATCG CGTCGCCGTT CACGTCTACG GCACGCGGCA GTACTTTTAC ATGAACAAGG 6900
CAGAGGTGGA TCGGCACCTG CAGTGCCGTG CCCCGCGCGA TCTCTGCGAG CGCCTGGCGG 6960
CGGCCCTGCG CGAGTCGCCG GGGGCGTCGT TCCGCGGCAT CTCCGCGGAC CACTTCGAGG 7020
CGGAGGTGGT GGAGCGCGCC GACGTGTACT ATTACGAAAC GCGCCCGACC CTGTACTACC 7080
GCGTCTTCGT GCGAAGCGGG CGCGCGCTGG CCTACCTGTG CGACAACTTT TGCCCCGCGA 7140
TCAGGAAGTA CGAGGGGGGC GTCGACGCCA CCACCCGGTT TATCCTGGAC AACCCGGGGT 7200
TTGTCACCTT CGGCTGGTAC CGCCTCAAGC CCGGCCGCGG GAACGCGCCG GCCCAACCGC 7260
GCCCCCCGAC GGCGTTCGGA ACCTCGAGCG ACGTCGAGTT TAACTGCACG GCGGACAACC 7320
TGGCCGTCGA GGGGGCCATG TGTGACCTGC CGGCCTACAA GCTCATGTGC TTCGATATCG 7380
AATGCAAGGC CGGGGGGGAG GACGAGCTGG CCTTTCCGGT CGCGGAACGC CCGGAAGACC 7440
TCGTCATCCA GATCTCCTGT CTGCTCTACG ACCTGTCCAC CACCGCCCTC GAGCACATCC 7500
TCCTGTTTTC GCTCGGATCC TGCGACCTCC CCGAGTCCCA CCTCAGCGAT CTCGCCTCCA 7560
GGGGCCTGCC GGCCCCCGTC GTCCTGGAGT TTGACAGCGA ATTCGAGATG CTGCTGGCCT 7620
TCATGACCTT CGTCAAGCAG TACGGCCCCG AGTTCGTGAC CGGGTACAAC ATCATCAACT 7680
TCGACTGGCC CTTCGTCCTG ACCAAGCTGA CGGAGATCTA CAAGGTCCCG CTCGACGGGT 7740
ACGGGCGCAT GAACGGCCGG GGTGTGTTCC GCGTGTGGGA CATCGGCCAG AGCCACTTTC 7800
AGAAGCGCAG CAAGATCAAG GTGAACGGGA TGGTGAACAT CGACATGTAC GGCATCATCA 7860
CCGACAAGGT CAAACTCTCC AGCTACAAGC TGAACGCCGT CGCCGAGGCC GTCTTGAAGG 7920
ACAAGAAGAA GGATCTGAGC TACCGCGACA TCCCCGCCTA CTACGCCTCC GGGCCCGCGC 7980
AGCGCGGGGT GATCGGCGAG TATTGTGTGC AGGACTCGCT GCTGGTCGGG CAGCTGTTCT 8040
TCAAGTTTCT GCCGCACCTG GAGCTTTCCG CCGTCGCGCG CCTGGCGGGC ATCAACATCA 8100
CCCGCACCAT CTACGACGGC CAGCAGATCC GCGTCTTCAC GTGCCTCCTG CGCCTTGCGG 8160
GCCAGAAGGG CTTCATCCTG CCGGACACCC AGGGGCGGTT TCGGGGCCTC GACAAGGAGG 8220
CGCCCAAGCG CCCGGCCGTG CCTCGGGGGG AAGGGGAGCG GCCGGGGGAC GGGAACGGGG 8280
ACGAGGATAA GGACGACGAC GAGGACGGGG ACGAGGACGG GGACGAGCGC GAGGAGGTCG 8340
CGCGCGAGAC CGGGGGCCGG CACGTTGGGT ACCAGGGGGC CCGGGTCCTC GACCCCACCT 8400
CCGGGTTTCA CGTCGACCCC GTGGTGGTGT TTGACTTTGC CAGCCTGTAC CCCAGCATCA 8460
TCCAGGCCCA CAACCTGTGC TTCAGTACGC TCTCCCTGCG GCCCGAGGCC GTCGCGCACC 8520
TGGAGGCGGA CCGGGACTAC CTGGAGATCG AGGTGGGGGG CCGACGGCTG TTCTTCGTGA 8580
AGGCCCACGT ACGCGAGAGC CTGCTGAGCA TCCTGCTGCG CGACTGGCTG GCCATGCGAA 8640
AGCAGATCCG CTCGCGGATC CCCCAGAGCA CCCCCGAGGA GGCCGTCCTC CTCGACAAGC 8700
AACAGGCCGC CATCAAGGTG GTGTGCAACT CGGTGTACGG GTTCACCGGG GTGCAGCACG 8760
GTCTTCTGCC CTGCCTGCAC GTGGCCGCCA CCGTGACGAC CATCGGCCGC GAGATGCTCC 8820
TCGCGACGCG CGCGTACGTG CACGCGCGCT GGGCGGAGTT CGATCAGCTG CTGGCCGACT 8880
TTCCGGAGGC GGCCGGCATG CGCGCCCCCG GTCCGTACTC CATGCGCATC ATCTACGGGG 8940
ACACGGACTC CATTTTCGTT TTGTGCCGCG GCCTCACGGC CGCGGGCCTG GTGGCCATGG 9000
GCGACAAGAT GGCGAGCCAC ATCTCGCGCG CGCTGTTCCT CCCCCCGATC AAGCTCGAGT 9060
GCGAAAAAAC GTTCACCAAG CTGCTGCTCA TCGCCAAGAA AAAGTACATC GGCGTCATCT 9120
GCGGGGGCAA GATGCTCATC AAGGGCGTGG ATCTGGTGCG CAAAAACAAC TGCGCGTTTA 9180
TCAACCGCAC CTCCAGGGCC CTGGTCGACC TGCTGTTTTA CGACGATACC GTATCCGGAG 9240
CGGCCGCCGC GTTAGCCGAG CGCCCCGCAG AGGAGTGGCT GGCGCGACCC CTGCCCGAGG 9300
GACTGCAGGC GTTCGGGGCC GTCCTCGTAG ACGCCCATCG GCGCATCACC GACCCGGAGA 9360
GGGACATCCA GGACTTTGTC CTCACCGCCG AACTGAGCAG ACACCCGCGC GCGTACACCA 9420
ACAAGCGCCT GGCCCACCTG ACGGTGTATT ACAAGCTCAT GGCCCGCCGC GCGCAGGTCC 9480
CGTCCATCAA GGACCGGATC CCGTACGTGA TCGTGGCCCA GACCCGCGAG GTAGAGGAGA 9540
CGGTCGCGCG GCTGGCCGCC CTCCGCGAGC TAGACGCCGC CGCCCCAGGG GACGAGCCCG 9600
CCCCCCCAGC GGCCCTGCCC TCCCCGGCCA AGCGCCCCCG GGAGACGCCG TCGCATGCCG 9660
ACCCCCCGGG AGGCGCGTCC AAGCCCCGCA AGCTGCTGGT GTCCGAGCTG GCGGAGGATC 9720
CCGGGTACGC CATCGCCCGG GGCGTTCCGC TCAACACGGA CTATTACTTC TCGCACCTGC 9780
TGGGGGCGGC CTGCGTGACG TTCAAGGCCC TGTTTGGAAA TAACGCCAAG ATCACCGAGA 9840
GTCTGTTAAA GAGGTTTATT CCCGAGACGT GGCACCCCCC GGACGACGTG GCCGCGCGGC 9900
TCAGGGCCGC GGGGTTCGGG CCGGCGGGGG CCGGCGCTAC GGCGGAGGAA ACTCGTCGAA 9960
TGTTGCATAG AGCCTTTGAT ACTCTAGCAT GAGCCCCCCG TCGAAGCTGA TGTCCCGCAT 10020
CTTGCAATAA ATGTCTGCGG CCGACACGGT CGGAATTTCC GCGTCCGCTG GTTTCTCTGC 10080
GTTGCGTCTG ACCACGAGCA CAAACGTGCT CTGCCACACG TGGGCGGCGA ACCGGTAGCC 10140
GGGGCACGCG GTCAGCATCC GATCGATGAG CCGGTAGTGC AGGTGGGCCG ACGTGCCGGG 10200
GAAGATGACG TACAGCATGT GGCCCCCGTA CGTGGGGTCC GGGTAAAAAA GAAACCGGGG 10260
GTCGCACGCC CCCCCTCCGC GCAGGATCGT GTGCACGAAA AAGAGCTCGG GCTGGCCGAG 10320
CGTATCGGCC AGGAGGTCCT GGAGGGGGGT GCTGTGGCGG TCGGCCAGCA CGACCAGGGA 10380
GGCCAGAAAG GTGCGGTGCT CAAAGATCGT ATTGATCTGC TGCACGAAGG CCAGGATGAG 10440
GGCCTCGCGG CTGACGGTGG CCAGCCGCCC GTCGCCCGCG CTGCACGCGG GGCAGCAGCC 10500
CCCGATCCCC AGGTAGTAGC CCATGCCCGA GAGGGTCAGG CAGTTGTCGG CCACGGTCTG 10560
GTCCAGGCTG AAGGGGAGCG ACACGGGGGT CGTCTTCACC AGGGGCACGG ATAGCGAGCG 10620
CACGATGGCG ATCTCCTCGG AGGGCGTCTG GGCGAGGGCG GCGAAGAAGC CGCGGTAGCG 10680
ACGGCGCTCG TGCAGGCAGA GCTCCAGCCT GCGCGCGTGC GACGGCAGGC TCTTGCGGGA 10740
GGCCCGGCGC TCCACGCCGG GGTTCCCGGC GGCGGAAAAG CGCGACCGCC GCCGGGTCTT 10800
GTCGCGGCCG GGCCCGGGCC GGGAGCCGGA GCGACGGGGG GCGATGTCAT ACATAGGTAC 10860
AGAGGGTGTG CTCCAGGGAC AGGAGAGAGA TCGAGTGTCG TCTGAGCAGC GCGCCGGCCT 10920
CGCGGACAAA TGTGGCCAGC GCGGTGGGCT TCGGCACAAA TACCTGGTAC GTCTTGAAGG 10980
TGTAGATGAG GGCCCGCAGG GCTATACAGA CCCGCCCCTC GAACTCGTTG CCGCAGGCCA 11040
ACTTGGCCTT GTGAAGCTGC AGCTCGTCGC GATGGTCGGC GCGGGGGTGG CCAAACAGGA 11100
CCCAGGGGTC GACTTCCATC TCCGTGATGG CGCACATGGG ATCGCAGAAC ATGTGCTTGA 11160
AGATGGCCTC GGGGCCCGCG GCCCGAAGCA GGCTCACGAA CCGGCCCCCG TCCCCGGGCT 11220
GCGCCTCGGG GTCCGCCTCG AGCTGGTCCA CGACCGGCAC TATGCAGTCG AAGAGGCTGG 11280
TGTTGTTCTC CGAGTAGCGG ACGACGGACG CCCTCAGGCG TCGCATGGCC AGCCAGTAGG 11340
CCCGCACCAG CAACAGATTG CACAGCAGGC ATTCCCCGCC GGTGCGCCCG CGGCCCCGGC 11400
CGTGCTTCAG CACGGTGGCC ATCAGCGGGC CCAGGTCCAG GTCGGGCTGG GCCTTGGGCT 11460
CGGCGAACTG CGCAAAGCGC GGGGCCGCGT CGCGCATGCG CGCCCCGCGG TGCGCTTCCC 11520
AGGACTCGCT GACCGCGGCG CGGCGGGCGT CCGCGGCGGC GCGCAGCCGG GGCCCCGACT 11580
CCCAGACGGC GGGGGTGCCG GCGAGCAGCA GCAGGATCAG GTCGGCGTAC GCCCACGTCT 11640
CCGGCTCACC CCCCTGCGCC AGCGCCCCGG CGGCGGCCTC GAACTCCCCG TTGCGGGCGG 11700
CGGCGCGCGT GCAGCAGCTG TCTCCGCCCC CGCGCTTGCC CTCGGTGCAG TCGAGCAGGC 11760
GGGCGCAGTC CTTCCAGTTC ATCAGGGCGG TGGTGAGGGA GGGTTGCGTT CCCGAGCCCC 11820
CGCCCGCCCC CGCCCCGTCA TCGCCCCCGG AGGCCAGGGT CCCGATGAGG GCCCGGGTTG 11880
CGGACTGCGC GAGGAAGGAA TAGTTGGAGT ACTGCACCTT GGCGGCGCCC GGGGAGGGCG 11940
TCGGCCTGGG TTGCTTCTGG GCGTGGCGCC CGGGCACCCC GCCGTCGGTC CGGAAGCAGC 12000
AGTGGAGAAA GAAATGCCGG TGGATGTCGT TGATGGTCAG GGCGAAGCGC GCGAAGGAGC 12060
CGACAAGGGT CGCCTTCTTG GTGCGCAGGA AGTGGTGGTC CATGACGTAG ACGAACTCGA 12120
AGGCGGCCAC GAAGATGCTC GCGGCGCAGT GGGGCGCGCC CAGGCACTTG GCGCAGAGGA 12180
ACGCGTAATC GGCCACCCAC TGGGGCGAGA GGCGGTAGGC CTGCTTGTAC AGCTCGATGG 12240
TGCGGCAGAC CAGACAGGGG CGGTCCAGCG CGAAGGTGTC GACGGACGCC GCGGCGAAGG 12300
GCCCCGTGTC CAAGAGTCCC TCTGCCGTGG GGTCTGCGGG CGGGCCGCGG GCGGACCCCG 12360
GCCCCCGCCC CCCCGAAGCC TCGCGCGCGG CCCCGCGCGG CCGCGGGGGG GCGGGCGCGA 12420
CGTCGCTCTC CACGTCCTCG TCGAGCGCGC TCGCGGGCGG CACGCCTACC ACGTGACAGG 12480
CCGCCAGGAG CTCGGCGCAC AGGGCCTCGT TAAGAGCCAG AAGGTCGGGA TCGAAGGCCA 12540
CATACGGACG CTCGAACGCG CCCTCCTTCC AGCTGCTGCC CGGCGACTGT TCGCGCACGG 12600
CGGCGCTCGA CGGCACCCCC GGGGCGGACG TCGCCATGGC CGGTCGAGCG GGGCGCACGC 12660
GTCCGCGAAC GTTACGGGAC GCGATCCCCG ACTGCGCGCT GCGGTCCCAG ACCCTGGAAA 12720
GTCTAGACGC GCGCTACGTC TCGCGAGACG GCGCGGGGGA CGCGGCCGTC TGGTTCGAGG 12780
ACATGACCCC CGCCGAACTA GAGGTTATAT TCCCGACCAC GGACGCCAAG CTGAACTACC 12840
TCTCGCGGAC GCAGCGGCTG GCCTCCCTCC TGACGTACGC CGGGCCTATA AAAGCGCCCG 12900
ACGGCCCCGC CGCCCCACAT ACGCAGGACA CCGCGTGCGT GCACGGCGAG CTGCTCGCCC 12960
GAAAGCGCGA ACGGTTCGCG GCGGTCATTA ACCGGTTCCT GGACCTGCAC CAGATCCTGC 13020
GGGGCTGACG CGCGCTTCGG CGGGGCACCG GCACCGGGAC CGACTTGTTT TACATAACAG 13080
TAGGGGGTGG GGGAACGCGC ACCCTTGCCC GGTCGCGATG GCGGGGATGG GGAAGCCCTA 13140
CGGCGGCCGC CCGGGGGACG CGTTCGAGGG TCTCGTTCAG CGCATCAGGC TCATTGTTCC 13200
CACCACGCTG CGCGGCGGGG GTGGGGAGTC GGGCCCCTAC TCGCCATCCA ACCCGCCCTC 13260
GAGATGTGCC TTCCAGTTCC ACGGCCAGGA TGGGTCCGAC GAGGCCTTCC CGATCGAGTA 13320

CGTCCTGCGG CTCATGAACG ACTGGGCCGA
GAACACCGGC GTTTCGGTGC TGTTTCAGGG
GGGCGCGATC ACGGCGGAGC AGACCAACGT
GTCCCTCGGA GACCTGGACG ACGTCAAGGG
GGCCAGCATG TGGATCAGCT GCTTTGTGCG
CATGGGCCCC GAGGACGCCG TTCGCACGCG
CCTCGCCCGT CGCCGCCGGT CCAGGCGGTC
GGCGGCGCAC CACTCTTCCG GAGCGCCCGG
GCCGCCCGGA CGGGGACCGG CCCGTCCGTG
GCGTCCGGGC CCCCCGGCGC TTCTGTTGCT
CTGGTGGGCG GTTGGCGCGC GCCTATGAAA
CATCCCAGAC GCCCGCGAGC CGCACATCCC
GGCGCGACCC AAGGTCCCGA TGGCCGCCCC
CGACAACGTC CGGGCGCTCG GCATGCGCGG
CATCATGGAT AACAGCTACC CGCATCCGCA
TCGCGGGCAG GCCGCGGCGC TGACGGACCT
CCCGCAGCCT ATGTTCGCGG GCGACGCCGC
TAAGCGCACG TATTCCCCCT TTGTCGTTCG
CCTCGGCGGG TCCCTCCGCG GCCGTCTCTC
TTCAATAAAA AACACCAACA TACGATATTC
GGCCCAACGA TCGGCGATTA ACAACACCAA
CGCACGTGAT GTAGGCTGGT CAGCACGGCG
CGCTGCAGCT GTTGTTGTAT GCGGCGGCAT
CCGGTGCTTC GTACGTAGCG TCGCGACAAG
AATTGCGAGT GTGGTGACTG GAGGTGGTCG
GGGGGCAAGT GCGGTTCCGG TGGGAGGGGG
AAACGCAGGG AGTCTGCGTC GGAGTGTTCA



TGTGCCCTGC 


AACCCCTACC 


TGCGCGTGCA 


13380 


GTTTTTTAAC 


CGGCCCCACG 


GCGCCCCGGG 


13440 


GATTCTGCAC 


TCCACCGAGA 


CGACGGGACT 


13500 


GCGCCTCGGC 


CTGGACGCCC 


GGCCGATGAT 


13560 


CATGCCCCGG 


GTGCAGCTCG 


CGTTTCGGTT 


13620 


GCGGATCCTG 


TGTCGCGCCG 


CCGAGCAGGC 


13680 


CCAGGATGAC 


TACGGGGCGG 


TGGCGGTGGC 


13740 


GCCGGGGGTC 


GCCGCCTCGG 


GCCCGCGAGC 


13800 


GCATCAGGCC 


GTGCAGTTGT 


TCCGGGCCCC 


13860 


GGTGGCGGGG 


CTGTTTCTGG 


GGGCCGCTAT 


13920 


GGGGGCGAGC 


CACCGTCCCG 


CCCGCCAGTG 


13980 


CTCCGCTCCC 


GCCTCCGGCC 


CGATTCTTAC 


14040 


GCAGTTTCAC 


CGCCCCAGCA 


CCATTACCGC 


14100 


GCTCGTGTTG 


GCCACCAACA 


ACGCTCAGTT 


14160 


CGGAACGCAG 


GGTGCGGTGC 


GAGAGTTTCT 


14220 


CGGGGTGACC 


CACGCCAACA 


ACACGTTCGC 


14280 


GGCCGAATGG 


CTGCGGCCCT 


CGTTCGGTCT 


14340 


CGACCCCAAG 


ACCCCCAGCA 


CCCCGTGAGT 


14400 


GTTGCCCCCC 


CTTTCCCCCT 


TCCCGGGTGG 


14460 


GCGTTTGATA 


CGTTTATTGG 


GGGGGTGTAG 


14520 


ACAATCGAGC 


GCGTCTAACC 


CAGTAACATG 


14580 


TTGCTGCGCT 


GAAACAGCGC 


CCTGCGGGTC 


14640 


GCGCGGATCA 


AAACCGCCAG 


GGCGCTACGA 


14700 


ACGGCATTTG 


CCTGTACGGG 


CAAGGGGCCA 


14760 


GCGGCCAATG 


GGCCGGGTGG 


TTCGTCGGCG 


14820 


TCGAGCGCCT 


CGGTATCATC 


CGAGTCCGAG 


14880 


TCATCGGAGG 


AGATGT 




14927 



(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 495 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Met Ala Ala Ala Pro Pro Ala Ala Val Ser Glu Pro Thr Ala Ala Arg
1 5 10 15

Gln Lys Leu Leu Ala Leu Leu Gly Gln Val Gln Thr Tyr Val Phe Gln
20 25 30

Leu Glu Leu Leu Arg Arg Cys Asp Pro Gln Ile Gly Leu Gly Lys Leu
35 40 45

Ala Gln Leu Lys Leu Asn Ala Leu Gln Val Arg Val Leu Arg Arg His
50 55 60

Leu Arg Pro Gly Leu Glu Ala Gln Ala Ala Ala Phe Leu Thr Pro Leu
65 70 75 80

Ser Val Thr Leu Glu Leu Leu Leu Glu Tyr Ala Trp Arg Glu Gly Glu
85 90 95

Arg Leu Leu Gly His Leu Glu Thr Phe Ala Thr Thr Gly Asp Val Ser
100 105 110

Ala Phe Phe Thr Glu Thr Met Gly Leu Ala Arg Pro Cys Pro Tyr His
115 120 125

Gln Gln Ile Arg Leu Glu Thr Tyr Gly Gly Asp Val Arg Met Glu Leu
130 135 140

Cys Phe Leu His Asp Val Glu Asn Phe Leu Lys Gln Leu Asn Tyr Cys
145 150 155 160

His Leu Ile Thr Pro Pro Ser Gly Ala Thr Ala Ala Leu Glu Arg Val
165 170 175

Arg Glu Phe Met Val Ala Ala Val Gly Ser Gly Leu Ile Val Pro Pro
180 185 190

Glu Leu Ser Asp Pro Ser His Pro Cys Ala Val Cys Phe Glu Glu Leu
195 200 205

Cys Val Thr Ala Asn Gln Gly Ala Thr Ile Ala Arg Arg Leu Ala Asp
210 215 220

Arg Ile Cys Asn His Val Thr Gln Gln Ala Gln Val Arg Leu Asp Ala
225 230 235 240

Asn Glu Leu Arg Arg Tyr Leu Pro His Ala Ala Gly Leu Ser Asp Ala
245 250 255

Ala Arg Ala Arg Ala Leu Cys Val Leu Asp Gln Ala Arg Thr Ala Ala
260 265 270

Gly Gly Gly Ala Arg Ala Gly Pro Pro Pro Ala Asp Ser Ser Ser Val
275 280 285

Arg Glu Glu Ala Asp Ala Leu Leu Glu Ala His Asp Val Phe Gln Ala
290 295 300

Thr Thr Pro Gly Ala Ile Ser Glu Leu Arg Phe Trp Leu Ala Ser Gly
305 310 315 320

Asp Arg Ala Arg His Ser Thr Met Asp Ala Phe Ala Asp Asn Leu Asn
325 330 335

Ala Gln Arg Glu Leu Gln Gln Glu Thr Ala Ala Val Ala Val Glu Leu
340 345 350

Ala Leu Phe Gly Arg Arg Ala Glu His Phe Asp Arg Ala Phe Gly Gly
355 360 365

His Leu Ala Ala Leu Asp Met Val Asp Ala Leu Ile Ile Gly Gly Gln
370 375 380

Ala Thr Ser Pro Asp Asp Gln Ile Glu Ala Leu Ile Arg Ala Cys Tyr
385 390 395 400

Asp His His Leu Thr Thr Pro Leu Leu Arg Arg Leu Val Ser Pro Glu
405 410 415

Gln Cys Asp Glu Glu Ala Leu Arg Arg Val Leu Ala Arg Leu Gly Ala
420 425 430

Gly Gly Ala Thr Gly Gly Ala Glu Glu Glu Glu Pro Arg Ala Ala Ala
435 440 445

Glu Glu Gly Gly Arg Arg Arg Gly Ala Gly Thr Pro Ala Ser Glu Asp
450 455 460

Gly Glu Arg Gly Pro Glu Pro Gly Ala Gln Gly Pro Glu Ser Trp Gly
465 470 475 480

Asp Ile Ala Thr Arg Ala Ala Ala Asp Val Xaa Xaa Xaa Xaa Xaa
485 490 495



(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1186 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Met Asp Thr Lys Pro Lys Thr Thr Thr Thr Val Lys Val Pro Pro Gly
1 5 10 15

Pro Met Gly Tyr Val Tyr Gly Arg Ala Cys Pro Ala Glu Gly Leu Glu
20 25 30

Leu Leu Ser Leu Leu Ser Ala Arg Ser Gly Asp Ala Asp Val Ala Val
35 40 45

Ala Pro Leu Ile Val Gly Leu Thr Val Glu Ser Gly Phe Glu Ala Asn
50 55 60

Val Ala Ala Val Val Gly Ser Arg Thr Thr Gly Leu Gly Gly Thr Ala
65 70 75 80

Val Ser Leu Lys Leu Met Pro Ser His Tyr Ser Pro Ser Val Tyr Val
85 90 95

Phe His Gly Gly Arg His Leu Ala Pro Ser Thr Gln Ala Pro Asn Leu
100 105 110

Thr Arg Leu Cys Glu Arg Ala Arg Arg His Phe Gly Phe Ser Asp Tyr
115 120 125

Ala Pro Arg Pro Cys Asp Leu Lys His Glu Thr Thr Gly Asp Ala Leu
130 135 140

Cys Glu Arg Leu Gly Leu Asp Pro Asp Arg Ala Leu Leu Tyr Leu Val
145 150 155 160

Ile Thr Glu Gly Phe Arg Glu Ala Val Cys Ile Ser Asn Thr Phe Leu
165 170 175

His Leu Gly Gly Met Asp Lys Val Thr Ile Gly Asp Ala Glu Val His
180 185 190

Arg Ile Pro Val Tyr Pro Leu Gln Met Phe Met Pro Asp Phe Ser Arg
195 200 205

Val Ile Ala Asp Pro Phe Asn Cys Asn His Arg Ser Ile Gly Glu Asn
210 215 220

Phe Asn Tyr Pro Leu Pro Phe Phe Asn Arg Pro Leu Ala Arg Leu Leu
225 230 235 240

Phe Glu Ala Val Val Gly Pro Ala Ala Val Arg Ala Arg Asn Val Asp
245 250 255

Ala Val Ala Arg Ala Ala Ala His Leu Ala Phe Asp Glu Asn His Glu
260 265 270

Gly Ala Ala Leu Pro Ala Asp Ile Thr Phe Thr Ala Phe Glu Ala Ser
275 280 285

Gln Gly Lys Pro Gln Arg Gly Ala Arg Asp Ala Gly Asn Lys Gly Pro
290 295 300

Ala Gly Gly Phe Glu Gln Arg Leu Ala Ser Val Met Ala Gly Asp Ala
305 310 315 320

Ala Leu Glu Ser Ile Val Ser Met Ala Val Phe Asp Glu Pro Pro Pro
325 330 335

Asp Ile Thr Thr Trp Pro Leu Leu Glu Gly Gln Glu Thr Pro Ala Ala
340 345 350

Arg Ala Gly Ala Val Gly Ala Tyr Leu Ala Arg Ala Ala Gly Leu Val
355 360 365

Gly Ala Met Val Phe Ser Thr Asn Ser Ala Leu His Leu Thr Glu Val
370 375 380

Asp Asp Ala Gly Pro Ala Asp Pro Lys Asp His Ser Lys Pro Ser Phe
385 390 395 400

Tyr Arg Phe Phe Leu Val Pro Gly Thr His Val Ala Ala Asn Pro Gln
405 410 415

Leu Asp Arg Glu Gly His Val Val Pro Gly Tyr Glu Gly Arg Pro Thr
420 425 430

Ala Pro Leu Val Gly Gly Thr Gln Glu Phe Ala Gly Glu His Leu Ala
435 440 445

Met Leu Cys Gly Phe Ser Pro Ala Leu Leu Ala Lys Met Leu Phe Tyr
450 455 460

Leu Glu Arg Cys Asp Gly Gly Val Ile Val Gly Arg Gln Glu Met Asp
465 470 475 480

Val Phe Arg Tyr Val Ala Asp Ser Gly Gln Thr Asp Val Pro Cys Asn
485 490 495

Leu Cys Thr Phe Glu Thr Arg His Ala Cys Ala His Thr Thr Leu Met
500 505 510

Arg Leu Arg Ala Arg His Pro Lys Phe Ala Ser Ala Arg Ala Ile Gly
515 520 525

Val Phe Gly Thr Met Asn Ser Ala Tyr Ser Asp Cys Asp Val Leu Gly
530 535 540

Asn Tyr Ala Ala Phe Ser Ala Leu Lys Arg Ala Asp Gly Ser Glu Asn
545 550 555 560

Thr Arg Thr Ile Met Gln Glu Tyr Ala Ala Thr Glu Arg Val Met Ala
565 570 575

Glu Leu Glu Ala Leu Gln Tyr Val Asp Gln Ala Val Pro Thr Ala Leu
580 585 590

Gly Arg Leu Glu Thr Ile Ile Gly Thr Arg Glu Ala Leu His Thr Val
595 600 605

Val Asn Asn Ile Lys Gln Leu Val Asp Arg Glu Val Glu Gln Leu Met
610 615 620

Arg Asn Leu Ile Glu Gly Arg Asn Phe Lys Phe Arg Asp Gly Leu Ala
625 630 635 640

Glu Ala Asn His Ala Met Ser Leu Ser Leu Asp Pro Tyr Thr Cys Gly
645 650 655

Pro Cys Pro Leu Leu Gln Leu Leu Ala Arg Arg Ser Asn Leu Ala Val
660 665 670

Tyr Gln Asp Leu Ala Leu Ser Gln Cys His Gly Val Phe Ala Gly Gln
675 680 685

Ser Val Glu Gly Arg Asn Phe Arg Asn Gln Phe Gln Pro Val Leu Arg
690 695 700

Arg Arg Val Met Asp Leu Phe Asn Asn Gly Phe Leu Ser Ala Lys Thr
705 710 715 720

Leu Thr Val Ser Glu Gly Ala Ala Ile Cys Ala Pro Ser Leu Thr Ala
725 730 735

Gly Gln Thr Ala Pro Ala Glu Ser Ser Phe Glu Gly Asp Val Ala Arg
740 745 750

Val Thr Leu Gly Phe Pro Lys Glu Leu Arg Val Lys Ser Arg Val Leu
755 760 765

Phe Ala Gly Ala Ser Ala Asn Ala Ser Glu Ala Ala Lys Ala Arg Val
770 775 780

Ala Ser Leu Gln Ser Ala Tyr Gln Lys Pro Asp Lys Arg Val Asp Ile
785 790 795 800

Leu Leu Gly Pro Leu Gly Phe Leu Leu Lys Gln Phe His Ala Val Ile
805 810 815

Phe Pro Asn Gly Lys Pro Pro Gly Ser Asn Gln Pro Asn Pro Gln Trp
820 825 830

Phe Trp Thr Ala Leu Gln Arg Asn Gln Leu Pro Ala Arg Leu Leu Ser
835 840 845

Arg Glu Asp Ile Glu Thr Ile Ala Phe Ile Lys Arg Phe Ser Leu Asp
850 855 860

Tyr Gly Ala Ile Asn Phe Ile Asn Leu Ala Pro Asn Asn Val Ser Glu
865 870 875 880

Leu Ala Met Tyr Tyr Met Ala Asn Gln Ile Leu Arg Tyr Cys Asp His
885 890 895

Ser Thr Tyr Phe Ile Asn Thr Leu Thr Ala Val Ile Ala Gly Ser Arg
900 905 910

Arg Pro Pro Gly Val Gln Ala Ala Ala Ala Trp Ala Pro Gln Gly Gly
915 920 925

Ala Gly Leu Glu Ala Gly Ala Arg Ala Leu Met Asp Ser Leu Asp Ala
930 935 940

His Pro Gly Ala Trp Thr Ser Met Phe Ala Ser Cys Asn Leu Leu Arg
945 950 955 960

Pro Val Met Ala Ala Arg Pro Met Val Val Leu Gly Leu Ser Ile Ser
965 970 975

Lys Tyr Tyr Gly Met Ala Gly Asn Asp Arg Val Phe Gln Ala Gly Asn
980 985 990

Trp Ala Ser Leu Leu Gly Gly Lys Asn Ala Cys Pro Leu Leu Ile Phe
995 1000 1005

Asp Arg Thr Arg Lys Phe Val Leu Ala Cys Pro Arg Ala Gly Phe Val
1010 1015 1020

Cys Ala Ala Ser Ser Leu Gly Gly Gly Ala His Glu His Ser Leu Cys
1025 1030 1035 1040

Glu Gln Leu Arg Gly Ile Ile Ala Glu Gly Gly Ala Ala Val Ala Ser
1045 1050 1055

Ser Val Phe Val Ala Thr Val Lys Ser Leu Gly Pro Arg Thr Gln Gln
1060 1065 1070

Leu Gln Ile Glu Asp Trp Leu Ala Leu Leu Glu Asp Glu Tyr Leu Ser
1075 1080 1085

Glu Glu Met Met Glu Phe Thr Thr Arg Ala Leu Glu Arg Gly His Gly
1090 1095 1100

Glu Trp Ser Thr Asp Ala Ala Leu Glu Val Ala His Glu Ala Glu Ala
1105 1110 1115 1120

Leu Val Ser Gln Leu Gly Ala Ala Gly Glu Val Phe Asn Phe Gly Asp
1125 1130 1135

Phe Gly Asp Glu Asp Asp His Ala Ala Ser Phe Gly Gly Leu Ala Ala
1140 1145 1150

Ala Ala Gly Ala Ala Gly Val Ala Arg Lys Arg Ala Phe His Gly Asp
1155 1160 1165

Asp Pro Phe Gly Glu Gly Pro Pro Glu Lys Lys Asp Leu Thr Leu Asp 
1170 1175 1180

Met Leu
1185



(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1228 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

Met Phe Cys Ala Ala Gly Gly Pro Thr Ser Pro Gly Gly Lys Ser Ala
1 5 10 15

Ala Arg Ala Ala Ser Gly Phe Phe Ala Pro His Asn Pro Arg Gly Ala
20 25 30

Thr Gln Thr Ala Pro Pro Pro Cys Arg Arg Gln Asn Phe Tyr Asn Pro
35 40 45

His Leu Ala Gln Thr Gly Thr Gln Pro Lys Ala Pro Gly Pro Ala Gln
50 55 60

Arg His Thr Tyr Tyr Ser Glu Cys Asp Glu Phe Arg Phe Ile Ala Pro
65 70 75 80

Arg Ser Leu Asp Glu Asp Ala Pro Ala Glu Gln Arg Thr Gly Val His
85 90 95

Asp Gly Arg Leu Arg Arg Ala Pro Lys Val Tyr Cys Gly Gly Asp Glu
100 105 110

Arg Asp Val Leu Arg Val Gly Pro Glu Gly Phe Trp Pro Arg Arg Leu
115 120 125

Arg Leu Trp Gly Gly Ala Asp His Ala Pro Glu Gly Phe Asp Pro Thr
130 135 140

Val Thr Val Phe His Val Tyr Asp Ile His Val Glu His Ala Tyr Ser
145 150 155 160

Met Arg Ala Ala Gln Leu His Glu Arg Phe Met Asp Ala Ile Thr Pro
165 170 175

Ala Gly Thr Val Ile Thr Leu Leu Gly Leu Thr Pro Glu Gly His Arg
180 185 190

Val Ala Val His Val Tyr Gly Thr Arg Gln Tyr Phe Tyr Met Asn Lys
195 200 205

Ala Glu Val Asp Arg His Leu Gln Cys Arg Ala Pro Arg Asp Leu Cys
210 215 220

Glu Arg Leu Ala Ala Ala Leu Arg Glu Ser Pro Gly Ala Ser Phe Arg
225 230 235 240

Gly Ile Ser Ala Asp His Phe Glu Ala Glu Val Val Glu Arg Ala Asp
245 250 255

Val Tyr Tyr Tyr Glu Trp Thr Leu Tyr Tyr Arg Val Phe Val Arg Ser
260 265 270

Gly Arg Ala Tyr Leu Cys Asp Asn Phe Cys Pro Ala Ile Arg Lys Tyr
275 280 285

Glu Gly Gly Val Asp Ala Thr Thr Arg Phe Ile Leu Asp Asn Pro Gly
290 295 300

Phe Val Thr Phe Gly Trp Tyr Arg Leu Lys Pro Gly Arg Gly Asn Ala
305 310 315 320

Pro Ala Gln Pro Arg Pro Pro Thr Ala Phe Gly Thr Ser Ser Asp Val
325 330 335

Glu Phe Asn Cys Thr Ala Asp Asn Leu Ala Val Glu Gly Ala Met Cys
340 345 350

Asp Leu Pro Ala Tyr Lys Leu Met Cys Phe Asp Ile Glu Cys Lys Ala
355 360 365

Gly Gly Glu Asp Glu Leu Ala Phe Pro Val Ala Glu Arg Pro Glu Asp
370 375 380

Leu Val Ile Gln Ile Ser Cys Leu Leu Tyr Asp Leu Ser Thr Thr Ala
385 390 395 400

Leu Glu His Ile Leu Leu Phe Ser Leu Gly Ser Cys Asp Leu Pro Glu
405 410 415

Ser His Leu Ser Asp Leu Ala Ser Arg Gly Leu Pro Ala Pro Val Val
420 425 430

Leu Glu Phe Asp Ser Glu Phe Glu Met Leu Leu Ala Phe Met Thr Phe
435 440 445

Val Lys Gln Tyr Gly Pro Glu Phe Val Thr Gly Tyr Asn Ile Ile Asn
450 455 460

Phe Asp Trp Pro Phe Val Leu Thr Lys Leu Thr Glu Ile Tyr Lys Val
465 470 475 480

Pro Leu Asp Gly Tyr Gly Arg Met Asn Gly Arg Gly Val Phe Arg Val
485 490 495

Trp Asp Ile Gly Gln Ser His Phe Gln Lys Arg Ser Lys Ile Lys Val
500 505 510

Asn Gly Met Val Asn Ile Asp Met Tyr Gly Ile Ile Thr Asp Lys Val
515 520 525

Lys Leu Ser Ser Tyr Lys Leu Asn Ala Val Ala Glu Ala Val Leu Lys
530 535 540

Asp Lys Lys Lys Asp Leu Ser Tyr Arg Asp Ile Pro Ala Tyr Tyr Ala
545 550 555 560

Ser Gly Pro Ala Gln Arg Gly Val Ile Gly Glu Tyr Cys Val Gln Asp
565 570 575

Ser Leu Leu Val Gly Gln Leu Phe Phe Lys Phe Leu Pro His Leu Glu
580 585 590

Leu Ser Ala Val Ala Arg Leu Ala Gly Ile Asn Ile Thr Arg Thr Ile
595 600 605

Tyr Asp Gly Gln Gln Ile Arg Val Phe Thr Cys Leu Leu Arg Leu Ala
610 615 620

Gly Gln Lys Gly Phe Ile Leu Pro Asp Thr Gln Gly Arg Phe Arg Gly
625 630 635 640

Leu Asp Lys Glu Ala Pro Lys Arg Pro Ala Val Pro Arg Gly Glu Gly
645 650 655

Glu Arg Pro Gly Asp Gly Asn Gly Asp Glu Asp Lys Asp Asp Asp Glu
660 665 670

Asp Gly Asp Glu Asp Gly Asp Glu Arg Glu Glu Val Ala Arg Glu Thr
675 680 685

Gly Gly Arg His Val Gly Tyr Gln Gly Ala Arg Val Leu Asp Pro Thr
690 695 700

Ser Gly Phe His Val Asp Pro Val Val Val Phe Asp Phe Ala Ser Leu
705 710 715 720

Tyr Pro Ser Ile Ile Gln Ala His Asn Leu Cys Phe Ser Thr Leu Ser
725 730 735

Leu Arg Pro Glu Ala Val Ala His Leu Glu Ala Asp Arg Asp Tyr Leu
740 745 750

Glu Ile Glu Val Gly Gly Arg Arg Leu Phe Phe Val Lys Ala His Val
755 760 765

Arg Glu Ser Leu Leu Ser Ile Leu Leu Arg Asp Trp Leu Ala Met Arg
770 775 780

Lys Gln Ile Arg Ser Arg Ile Pro Gln Ser Thr Pro Glu Glu Ala Val
785 790 795 800

Leu Leu Asp Lys Gln Gln Ala Ala Ile Lys Val Val Cys Asn Ser Val
805 810 815

Tyr Gly Phe Thr Gly Val Gln His Gly Leu Leu Pro Cys Leu His Val
820 825 830

Ala Ala Thr Val Thr Thr Ile Gly Arg Glu Met Leu Leu Ala Thr Arg
835 840 845

Ala Tyr Val His Ala Arg Trp Ala Glu Phe Asp Gln Leu Leu Ala Asp
850 855 860

Phe Pro Glu Ala Ala Gly Met Arg Ala Pro Gly Pro Tyr Ser Met Arg
865 870 875 880

Ile Ile Tyr Gly Asp Thr Asp Ser Ile Phe Val Leu Cys Arg Gly Leu
885 890 895

Thr Ala Ala Gly Leu Val Ala Met Gly Asp Lys Met Ala Ser His Arg
900 905 910

Ala Leu Phe Leu Pro Pro Ile Lys Leu Glu Cys Glu Lys Thr Phe Thr
915 920 925

Lys Leu Leu Leu Ile Ala Lys Lys Lys Tyr Ile Gly Val Ile Cys Gly
930 935 940

Gly Lys Met Leu Ile Lys Gly Val Asp Leu Val Arg Lys Asn Asn Cys
945 950 955 960

Ala Phe Ile Asn Arg Thr Ser Arg Ala Leu Val Asp Leu Leu Phe Tyr
965 970 975

Asp Asp Thr Val Ser Gly Ala Ala Ala Ala Glu Arg Pro Ala Glu Glu
980 985 990

Trp Leu Ala Arg Pro Leu Pro Glu Gly Leu Gln Ala Phe Gly Ala Val
995 1000 1005

Leu Val Asp Ala His Arg Arg Ile Thr Asp Pro Glu Arg Asp Ile Gln
1010 1015 1020

Asp Phe Val Leu Thr Ala Glu Leu Ser Arg His Pro Arg Ala Tyr Thr
1025 1030 1035 1040

Asn Lys Arg Leu Ala His Leu Thr Val Tyr Tyr Lys Leu Met Ala Arg
1045 1050 1055

Arg Ala Gln Val Pro Ser Ile Lys Asp Arg Ile Pro Tyr Val Ile Val
1060 1065 1070

Ala Gln Thr Arg Glu Val Glu Glu Thr Val Ala Arg Leu Ala Ala Leu
1075 1080 1085

Arg Glu Leu Asp Ala Ala Ala Pro Gly Asp Glu Pro Ala Pro Pro Ala
1090 1095 1100

Ala Leu Pro Ser Pro Ala Lys Arg Pro Arg Glu Thr Pro Ser His Ala
1105 1110 1115 1120

Asp Pro Pro Gly Gly Ala Ser Lys Pro Arg Lys Leu Leu Val Ser Glu
1125 1130 1135

Leu Ala Glu Asp Pro Gly Tyr Ala Ile Arg Val Pro Leu Asn Thr Asp
1140 1145 1150

Tyr Tyr Phe Ser His Leu Leu Gly Ala Ala Cys Val Thr Phe Lys Ala
1155 1160 1165

Leu Phe Gly Asn Asn Ala Lys Ile Thr Glu Ser Leu Leu Lys Arg Phe
1170 1175 1180

Ile Pro Glu Thr Trp His Pro Pro Asp Asp Val Ala Ala Arg Leu Arg
1185 1190 1195 1200

Ala Ala Gly Phe Gly Pro Ala Gly Ala Gly Ala Thr Ala Glu Glu Thr
1205 1210 1215

Arg Arg Met Leu His Arg Ala Phe Asp Thr Leu Ala
1220 1225

(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 303 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

Met Tyr Asp Ile Ala Pro Arg Arg Ser Gly Ser Arg Pro Gly Pro Gly
1 5 10 15

Arg Asp Lys Thr Arg Arg Arg Ser Arg Phe Ser Ala Ala Gly Asn Pro
20 25 30

Gly Val Glu Arg Arg Ala Ser Arg Lys Ser Leu Pro Ser His Ala Arg
35 40 45

Arg Leu Glu Leu Cys Leu His Glu Arg Arg Arg Tyr Arg Gly Phe Phe
50 55 60

Ala Ala Gln Thr Pro Ser Glu Glu Ile Ala Ile Val Arg Ser Leu Ser
65 70 75 80

Val Pro Leu Val Lys Thr Thr Pro Val Ser Leu Pro Phe Ser Leu Asp
85 90 95

Gln Thr Val Ala Asp Asn Cys Leu Thr Leu Ser Gly Met Gly Tyr Tyr
100 105 110

Leu Gly Ile Gly Gly Cys Cys Pro Ala Cys Ser Ala Gly Asp Gly Arg
115 120 125

Leu Ala Thr Val Ser Arg Glu Ala Leu Ile Leu Ala Phe Val Gln Gln
130 135 140

Ile Asn Thr Ile Phe Glu His Arg Thr Phe Leu Ala Ser Leu Val Val
145 150 155 160

Leu Ala Asp Arg His Ser Thr Pro Leu Gln Asp Leu Leu Ala Asp Thr
165 170 175

Leu Gly Gln Pro Glu Leu Phe Phe Val His Thr Ile Leu Arg Gly Gly
180 185 190

Gly Ala Cys Asp Pro Arg Phe Leu Phe Tyr Pro Asp Pro Thr Tyr Gly
195 200 205

Gly His Met Leu Tyr Val Ile Phe Pro Gly Thr Ser Ala His Leu His
210 215 220

Tyr Arg Leu Ile Asp Arg Met Leu Thr Ala Cys Pro Gly Tyr Arg Phe
225 230 235 240

Ala Ala His Val Trp Gln Ser Thr Phe Val Leu Val Val Arg Arg Asn
245 250 255

Ala Glu Lys Pro Ala Asp Ala Glu Ile Pro Thr Val Ser Ala Ala Asp
260 265 270

Ile Tyr Cys Lys Met Arg Asp Ile Ser Phe Asp Gly Gly Leu Met Leu
275 280 285

Glu Tyr Gln Arg Leu Tyr Ala Thr Phe Asp Glu Phe Pro Pro Pro
290 295 300

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 597 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Val Arg Pro Ala Arg Pro Ala Met Ala Thr Ser Ala Pro Gly Val Pro
1 5 10 15

Ser Ser Ala Ala Val Arg Glu Glu Ser Pro Gly Ser Ser Trp Lys Glu
20 25 30

Gly Ala Phe Glu Arg Pro Tyr Val Ala Phe Asp Pro Asp Leu Leu Ala
35 40 45

Leu Asn Glu Ala Leu Cys Ala Glu Leu Leu Ala Ala Cys His Val Val
50 55 60

Gly Val Pro Pro Ala Ser Ala Leu Asp Glu Asp Val Glu Ser Asp Val
65 70 75 80

Ala Pro Ala Pro Pro Arg Pro Arg Gly Ala Ala Arg Glu Ala Ser Gly
85 90 95

Gly Arg Gly Pro Gly Ser Arg Pro Pro Ala Asp Pro Thr Ala Glu Gly
100 105 110

Leu Leu Asp Thr Gly Pro Phe Ala Ala Ala Ser Val Asp Thr Phe Ala
115 120 125

Leu Asp Arg Pro Cys Leu Val Cys Arg Thr Ile Glu Leu Tyr Lys Gln
130 135 140

Ala Tyr Arg Leu Ser Pro Gln Trp Val Ala Asp Tyr Ala Phe Leu Cys
145 150 155 160

Ala Lys Cys Leu Gly Ala Pro His Cys Ala Ala Ser Ile Phe Val Ala
165 170 175

Ala Phe Glu Phe Val Tyr Val Met Asp His His Phe Leu Arg Thr Lys
180 185 190

Lys Ala Thr Leu Val Gly Ser Phe Ala Arg Phe Ala Leu Thr Ile Asn
195 200 205

Asp Ile His Arg His Phe Phe Leu His Cys Cys Phe Arg Thr Asp Gly
210 215 220

Gly Val Pro Gly Arg His Ala Gln Lys Gln Pro Arg Pro Thr Pro Ser
225 230 235 240

Pro Gly Ala Ala Lys Val Gln Tyr Ser Asn Tyr Ser Phe Leu Ala Gln
245 250 255

Ser Ala Thr Arg Ala Leu Ile Gly Thr Leu Ala Ser Gly Gly Asp Asp
260 265 270

Gly Ala Gly Ala Gly Gly Gly Ser Gly Thr Gln Pro Ser Leu Thr Thr
275 280 285

Ala Leu Met Asn Trp Lys Asp Cys Ala Arg Leu Leu Asp Cys Thr Glu
290 295 300

Gly Lys Arg Gly Gly Gly Asp Ser Cys Cys Thr Arg Ala Ala Ala Arg
305 310 315 320

Asn Gly Glu Phe Glu Ala Ala Ala Gly Ala Gln Gly Gly Glu Pro Glu
325 330 335

Thr Trp Ala Tyr Ala Asp Leu Ile Leu Leu Leu Leu Ala Gly Thr Pro
340 345 350

Ala Val Trp Glu Ser Gly Pro Arg Leu Arg Ala Ala Ala Asp Ala Arg
355 360 365

Arg Ala Ala Val Ser Glu Ser Trp Glu Ala His Arg Gly Ala Arg Met
370 375 380

Arg Asp Ala Ala Pro Arg Phe Ala Gln Phe Ala Glu Pro Lys Ala Gln
385 390 395 400

Pro Asp Leu Asp Leu Gly Pro Leu Met Ala Thr Val Leu Lys His Gly
405 410 415

Arg Gly Arg Gly Arg Thr Gly Gly Glu Cys Leu Leu Cys Asn Leu Leu
420 425 430

Leu Val Arg Ala Tyr Trp Leu Ala Met Arg Arg Leu Arg Ala Ser Val
435 440 445

Val Arg Tyr Ser Glu Asn Asn Thr Ser Leu Phe Asp Cys Ile Val Pro
450 455 460

Val Val Asp Gln Leu Glu Ala Asp Pro Glu Ala Gln Pro Gly Asp Gly
465 470 475 480

Gly Arg Phe Val Ser Leu Leu Arg Ala Ala Gly Pro Glu Ala Ile Phe
485 490 495

Lys His Met Phe Cys Asp Pro Met Cys Ala Ile Thr Glu Met Glu Val
500 505 510

Asp Pro Trp Val Leu Phe Gly His Pro Arg Ala Asp His Arg Asp Glu
515 520 525

Leu Gln Leu His Lys Ala Lys Leu Ala Cys Gly Asn Glu Phe Glu Gly
530 535 540

Arg Val Cys Ile Ala Leu Arg Ala Leu Ile Tyr Thr Phe Lys Thr Tyr
545 550 555 560

Gln Val Phe Val Pro Lys Pro Thr Ala Thr Phe Val Arg Glu Ala Gly
565 570 575

Ala Leu Leu Arg Arg His Ser Ile Ser Leu Leu Ser Leu Glu His Thr
580 585 590

Leu Cys Thr Tyr Val
595

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 128 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

Met Ala Gly Arg Ala Gly Arg Trp Arg Thr Leu Arg Asp Ala Ile Pro
1 5 10 15

Asp Cys Ala Leu Arg Ser Gln Thr Leu Glu Ser Leu Asp Ala Arg Tyr
20 25 30

Val Ser Arg Asp Gly Ala Gly Asp Ala Ala Val Trp Phe Glu Asp Met
35 40 45

Thr Pro Ala Glu Leu Glu Val Ile Phe Pro Thr Thr Asp Ala Lys Leu
50 55 60

Asn Tyr Leu Ser Arg Thr Gln Arg Leu Ala Ser Leu Leu Thr Tyr Ala
65 70 75 80

Gly Pro Ile Lys Ala Pro Asp Gly Pro Ala Ala Pro His Thr Gln Asp
85 90 95

Thr Ala Cys Val His Gly Glu Leu Asp Ala Thr Glu Arg Glu Arg Phe
100 105 110

Ala Ala Val Ile Asn Arg Phe Leu Asp Leu His Gln Ile Leu Arg Gly
115 120 125

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 274 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Met Ala Gly Met Gly Lys Pro Tyr Gly Gly Arg Pro Gly Asp Ala Phe
1 5 10 15

Glu Gly Leu Val Gln Arg Ile Arg Leu Ile Val Pro Thr Thr Leu Arg
20 25 30

Gly Gly Gly Gly Glu Ser Gly Pro Tyr Ser Pro Ser Asn Pro Pro Ser
35 40 45

Arg Cys Ala Phe Gln Phe His Gly Gln Asp Gly Ser Asp Glu Ala Phe
50 55 60

Pro Ile Glu Tyr Val Leu Arg Leu Met Asn Asp Trp Ala Asp Val Pro
65 70 75 80

Cys Asn Pro Tyr Leu Arg Val Gln Asn Thr Gly Val Ser Val Leu Phe
85 90 95

Gln Gly Phe Phe Asn Arg Pro His Gly Ala Pro Gly Gly Ala Ile Thr
100 105 110

Ala Glu Gln Thr Asn Val Ile Leu His Ser Thr Glu Thr Thr Gly Leu
115 120 125

Ser Leu Gly Asp Leu Asp Asp Val Lys Gly Arg Leu Gly Leu Asp Ala
130 135 140

Arg Pro Met Met Ala Ser Met Trp Ile Ser Cys Phe Val Arg Met Pro
145 150 155 160

Arg Val Gln Leu Ala Phe Arg Phe Met Gly Pro Glu Asp Ala Val Arg
165 170 175

Thr Arg Arg Ile Leu Cys Arg Ala Ala Glu Gln Ala Arg Arg Arg Arg
180 185 190

Ser Arg Arg Ser Gln Asp Asp Tyr Gly Ala Val Ala Val Ala Ala Ala
195 200 205

His His Ser Ser Gly Ala Pro Gly Pro Gly Val Ala Ala Ser Gly Pro
210 215 220

Pro Ala Pro Pro Gly Arg Gly Pro Ala Arg Pro Trp His Gln Ala Val
225 230 235 240

Gln Leu Phe Arg Ala Pro Arg Pro Gly Pro Pro Ala Leu Leu Leu Leu

245 250 255 

Val Ala Gly Leu Phe Leu Gly Ala Ala Ile Trp Trp Ala Val Gly Ala
260 265 270

Arg Leu 



(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 112 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Met Ala Ala Pro Gln Phe His Arg Pro Ser Thr Ile Thr Ala Asp Asn
1 5 10 15

Val Arg Ala Leu Gly Met Arg Gly Leu Val Leu Ala Thr Asn Asn Ala
20 25 30

Gln Phe Ile Met Asp Asn Ser Tyr Pro His Pro His Gly Thr Gln Gly
35 40 45

Ala Val Arg Glu Phe Leu Arg Gly Gln Ala Ala Ala Leu Thr Asp Leu
50 55 60

Gly Val Thr His Ala Asn Asn Thr Phe Ala Pro Gln Pro Met Phe Ala
65 70 75 80

Gly Asp Ala Ala Ala Glu Trp Leu Arg Pro Ser Phe Gly Leu Lys Arg
85 90 95

Thr Tyr Ser Pro Phe Val Val Arg Asp Pro Lys Thr Pro Ser Thr Pro
100 105 110

(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 837 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

CCCGCTAGTC TGGGGGCGAG GTGCTGCAGG ACCGAGTAGA GGATGGAAAA AACGTCTCGG 60
TCGTAAACCA CGACCGAGCG GGGTCCGATG CAGCCGTCGG GGCCGCTCTC GACGATGGCC 120
ACCAGCGGAC AGTCGGAGTT GTACGTGAGG TACACGCCCG GCGGGTAGCG GTACAGACCT 180
TCGGAGGTCG GGCGGCTGCA GTCGGGGCGG CGCAACTCAA GCTCCCCGCA CCGGTAGACC 240
GACGCAAAGA GTGTGGTGGC GATAATGAGC TCGCGAATAT ATCGCCAGGC GGCGCGCTGG 300
GTGGGCGTGA TTCCGGAAAC ACCGTCAAAA CAGTAGAACT TTTGAAACTC GCTGACGGCC 360
CAATCAGCGC CCGAACCCCC CGCGCCCATG ATGAAGCGGG CGAGTTCCTC CTTGAGGTGC 420
GGCAGGAGCC CCACGTTCTC GACGCTGTAG TACAGCGCGG TGTTGGGGGG CTGGGCGAAG 480
CTGTGGGTGG AGTGGTCGAA CAGGGGCCCG TTGACGAGCT CGAAGAAGCG ATGGGTGATG 540
CTGGGGAGCA GGGCCGGGTC CACCTGGTGG CGCAGCAGCG ACGCTCGCAT GAACCGGTGC 600
GCGTCAAACA CGCCCGGGGC GGCGCGGTTG TCGATGACCG TGCCCGCGCC CGCCGTCAGG 660
GCGCAGAAGC GCGCGCGCGC CGCGAAGCCG TTGGCGACCG CGGCGAAGGT CGCGGGCAGC 720
ACCTCGCCGT GGACGCTGAC CCGCAGCATC TTCTCGAGCT CCCCGCGCTG CTCGCGCACG 780
CAGCGCCCGA GGCTGGCCAG CGACCGCTTG GTCAGGCGGT CCGCGTACAG CCGCCG 837



(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 278 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Arg Arg Leu Tyr Ala Asp Arg Leu Thr Lys Arg Ser Leu Ala Ser Leu
1 5 10 15

Gly Arg Cys Val Arg Glu Gln Arg Gly Glu Leu Glu Lys Met Leu Arg
20 25 30

Val Ser Val His Gly Glu Val Leu Pro Ala Thr Phe Ala Ala Val Ala
35 40 45

Asn Gly Phe Ala Ala Arg Ala Arg Phe Cys Ala Leu Thr Ala Gly Ala
50 55 60

Gly Thr Val Ile Asp Asn Arg Ala Ala Pro Gly Val Phe Asp Ala His
65 70 75 80

Arg Phe Met Arg Ala Ser Leu Leu Arg His Gln Val Asp Pro Ala Leu
85 90 95

Leu Pro Ser Ile Thr Phe Phe Glu Leu Val Asn Gly Pro Leu Phe Asp
100 105 110

His Ser Thr His Ser Phe Ala Gln Pro Pro Asn Thr Ala Leu Tyr Tyr
115 120 125

Ser Val Glu Asn Val Gly Leu Leu Pro His Leu Lys Glu Glu Leu Ala 

130 135 140 

Arg Phe Ile Met Gly Ala Gly Gly Ser Gly Ala Asp Trp Ala Val Ser
145 150 155 160

Glu Phe Gln Lys Phe Tyr Cys Phe Asp Gly Val Ser Gly Ile Thr Pro
165 170 175

Thr Gln Arg Ala Ala Trp Arg Tyr Ile Arg Glu Leu Ile Ile Ala Thr
180 185 190

Thr Leu Phe Ala Ser Val Tyr Arg Cys Gly Glu Leu Glu Leu Arg Arg
195 200 205

Pro Asp Cys Ser Arg Pro Thr Ser Glu Gly Arg Tyr Pro Pro Gly Val
210 215 220

Tyr Leu Thr Tyr Asn Ser Asp Cys Pro Leu Val Ala Ile Val Glu Ser
225 230 235 240

Gly Pro Asp Gly Cys Ile Gly Pro Arg Ser Val Val Val Tyr Asp Arg
245 250 255

Asp Val Phe Ser Ile Lys Val Leu Gln His Leu Ala Pro Arg Leu Ala
260 265 270

Gly Xaa Xaa Xaa Xaa Xaa
275



(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2646 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

AAACAATACC AGAAGTCATG TGTATTTTTG AACATCGGTG TCTTTTTATT TATACACAAG 60
CCCAGCTCCC CTCCCCTCCC TTAGAGCTCG TCTTCGTCTC CGGCCTCGTC CTCGTTGTGG 120
AGCGGAGAGT ACCTGGCTTT GTTGCGCTTG CGCAGAACCA TGTTGGTGAC CTTGGAGCTG 180
AGCAGGGCGC TCGTGCCCTT CTTTCTGGCC TTGTGTTCCG TGCGCTCCAT GGCCGACACC 240
AAAGCCATAT ATCGGATCAT TTCTCGGGCC TCGGCCAACT TGGCCTCGTC AAACCCGCCC 300
CCCTCCGCGC CTTCCTCCCC CTCCCCGCCC ACGCCCCCGG GGTCGGAAGT CTTGAGTTCC 360
TTGGTGGTGA GCGGATACAG GGCCTTCATG GGATTGCGTT GCAGTTGCAG GACGTAGCGG 420
AAGGCGAAGA AGGCCGCGAC CAGGCCGGCC AGGACCAGCA GCCCCACGGC AAGCGCCCCG 480
AAGGGGTTGG ACATAAAGGA GGACACGCCC GAGACGGCCG ACACCACGCC CCCCACTACT 540
CCCATGACTA CCTTGCCGAC CGCGCGCCCC AAGTCCCCCA TCCCCTCGAA GAACGCGCAC 600
AGCCCCGCGA ACATGGCGGC GTTGGCGTCG GCGCGGATGA CCGTGTCGAT GTCGGCAAAG 660




CGCAGGTCGT GCAGCTGGTT GCGGCGCTGG ACCTCCGTGT AGTCCAGCAG GCCGCTGTCC 720
TTGATCTCGT GGCGCGTGTA GACCTCCAGG GGCACAAACT CGTGGTCCTC CAGCATGGTG 780
ATGTTCAGGT CGATGAAGGT GCTGACGGTG GTGACGTCGG CGCGACTCAG CTGGTGAGAG 840
TACGCGTACT CCTCGAAGTA CACGTAGCCC CCGCCGAAGA TGAAGTAGCG CCGGTGGCCC 900
ACGGTGCACG GCTCGAGCGC GTCGCGGGTG AGGCGCAGCT CGTTGTTCTC GCCCAGCTGC 960
CCCTCGATCA GCGGGCCCTG GTCTTCGTAC CGAAAGCTGA CCAGGGGGCG GCTGTAGCAC 1020
GTCCCCGGCC GCGAGCTGAC GCGCATCGAG TTCTGCACGA TCACGTTGTC CGGGGCGACG 1080
GGCACGCACG TGGAGACGGC CATGACGTCT CCGAGCATGC GCGCGCTCAC CCGCCGGCCG 1140
ACGGTGGCGG AGGCGATGGC GTTGGGGTTG AGCTTGCGGG CCTCGTTCCA GAGAGTCAGC 1200
TCGTGGTTCT GCAGCTCGCA CCACGCGACG GCGATGCGCC CCAGCATGTC GTTCACGTGG 1260
CGCTGTATGT GGTTATACGT AAACTGCAGC CGGGCGAACT CGATCGAGGA GGTGGTCTTG 1320
ATGCGCTCCA CGGACGCGTT GGCGCTGGGC GCCTCCCGCA GTGGCGCGGG CGTGGCATTC 1380
CGGGGCTTGC GGTCCTGCTC CCGCATGTAC TCCCGCACGT ACAGCTCGGC GAGCGTGTTG 1440
CTGAGGAGGG GCTGGTACGC GATGAGGAAG CCCCCCGTGG CCAGGTAGTA CTGCGGCTGG 1500
CCCACCTTGA TGTGCGTGGC GTTGTACTTG CGCGCAAACA TGCGGTCGAT GGCCTCGCGG 1560
GCATCCCGGC CAATGCAGTC GCCCAGGTCG ACGCGCGAGA GCGAGTACTG GGTCAGGTTG 1620
GTGGTGAAGG TGGTCGAGAT GGCGTCGGAG GAGAAGCGGA AGGAGCCGCC GTACTCGGCG 1680
CGGAGCATCT CGTCCACCTC CTGCCACTTG GTCATGGTGC AGACCGCCGG TCGCTTCGGC 1740
ACCCAGTCCC AGGCCACGGT AAACTTGGGG GTCGTCAGCA AGTTGCGGGT CGTCGGCGAC 1800
GTGGCCCGGG CCTTCGTGGT GAGGTCGCGC GCGTAGAAGC CGTCGACCTG CTTGAAGCGG 1860
TCGGCGGCGT AGCTGGTGTG CTCGGTGTGC GACCCCTCCC GGTAGCCGTA AAACGGGGAC 1920




ATGTACACAA 


AGTCGCCCGT 


CGCCAACACA 


AACTCATCGT 


ACGGGTACAC 


CGACCGCGCG 


1980 




TCCACCTCCT 


CGACGATGCA 


GTTGACCGTC 


GTGCCGTACC 


GATGGAACGC 


CTCCACCCGC 


2040 


25 


GAGGGGTTGT 


ACTTGAGGTC 


GGTGGTGTGC 


CACCCCCGGC 


TCGTGCGCGT 


GGCGACCTTC 


2100 




GCCGGCTTGA 


GCTCCATGTC 


GGTCTCGTGG 


TCGTCCCGGT 


GAAACGCGGT 


GGTCTCCATG 


2160 




TTGTTCCGCA 


CGTACTTGGC 


CGTGGAGCGG 


CAGACCCCCT 


TGGCGTTAAT 


CTTGTCGATC 


2220 




ACCTCCTCGA 


AGGGAACGGG 


GGCGCGGTCC 


TCGAATATCC 


CCATAAACTG 


GGAGTAGCGG 


2280 




TGGCCGAACC 


ACACCTGCGA 


CACGGTCACG 


TCTTTGTAGT 


ACATGGTGGC 


CTTGAATTTG 


2340 


30 


TACGGGGCGA 


TGTTCTCCTT 


GAAGACCACC 


GCGATGCCCT 


CCGTGTAGTT 


CTGCCCCTCC 


2400 




GGGCGCGTCG 


GGCAGCGGCG 


CGGCTGCTCA 


AACTGCACCA 


CCGTGGCGCC 


CGTCGGGGGC 


2460 




GGGCACACGT 


AAAACTGGGC 


ATCGGCGTTC 


TCGACCTTGA 


TTTCCCGCAG 


GTGCGCGCGC 


2520 




AGCGTGGCGT 


GGCCGGCGGC 


GACGGTCGCG 


TTGGCGTCGG 


GGGGCGGGGT 


CGCCTCGGGC 


2580 




CGCTTGGGCG 


GCTTTTTGGT 


TTTCCGCTTC 


CGGGCCTTGG 


TGGTCGCGGG 


GCTCGGGACG 


2640 


35 


GGGGG 












2646 



(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 846 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Pro Pro Val Pro Ser Pro Ala Thr Thr Lys Ala Arg Lys Arg Lys Thr
1 5 10 15

Lys Lys Pro Pro Lys Arg Pro Glu Ala Thr Pro Pro Pro Asp Ala Asn
20 25 30

Ala Thr Val Ala Ala Gly His Ala Thr Leu Arg Ala His Leu Arg Glu
35 40 45

Ile Lys Val Glu Asn Ala Asp Ala Gln Phe Tyr Val Cys Pro Pro Pro
50 55 60

Thr Gly Ala Thr Val Val Gln Phe Glu Gln Pro Arg Arg Cys Pro Trp
65 70 75 80

Glu Gly Gln Asn Tyr Thr Glu Gly Ile Ala Val Val Phe Lys Glu Asn
85 90 95

Ile Ala Pro Tyr Lys Phe Lys Ala Thr Met Tyr Tyr Lys Asp Val Thr
100 105 110

Val Ser Gln Val Trp Phe Gly His Arg Tyr Ser Gln Phe Met Gly Ile
115 120 125

Phe Glu Asp Arg Ala Pro Val Pro Phe Glu Glu Val Ile Asp Lys Ile
130 135 140

Asn Ala Lys Gly Val Cys Arg Ser Thr Ala Lys Tyr Val Arg Asn Asn
145 150 155 160

Met Thr Ala Phe His Arg Asp Asp His Glu Thr Asp Met Glu Leu Lys
165 170 175

Pro Ala Lys Val Ala Thr Arg Thr Ser Arg Gly Trp His Thr Thr Asp
180 185 190

Leu Lys Tyr Asn Pro Ser Arg Val Glu Ala Phe His Arg Tyr Gly Thr
195 200 205

Thr Val Asn Cys Ile Val Glu Glu Val Asp Ala Arg Ser Val Tyr Pro
210 215 220

Tyr Asp Glu Phe Val Leu Ala Thr Gly Asp Phe Val Tyr Met Ser Pro
225 230 235 240

Phe Tyr Gly Tyr Arg Glu Gly Ser His Thr Glu His Thr Ser Tyr Ala
245 250 255

Ala Asp Arg Phe Lys Gln Val Asp Gly Phe Tyr Ala Arg Asp Leu Thr
260 265 270

Thr Lys Ala Arg Ala Thr Ser Pro Thr Thr Arg Asn Leu Leu Thr Thr
275 280 285

Pro Lys Phe Thr Val Ala Trp Asp Trp Val Pro Lys Arg Pro Ala Val
290 295 300

Cys Thr Met Thr Lys Trp Gln Glu Val Asp Glu Met Leu Arg Ala Glu
305 310 315 320

Tyr Gly Gly Ser Phe Arg Phe Ser Ser Asp Ala Ile Ser Thr Thr Phe
325 330 335

Thr Thr Asn Leu Thr Gln Tyr Ser Leu Ser Arg Val Asp Leu Gly Asp
340 345 350

Cys Ile Gly Arg Asp Ala Arg Glu Ala Ile Asp Arg Met Phe Ala Arg
355 360 365

Lys Tyr Asn Ala Thr His Ile Lys Val Gly Gln Pro Gln Tyr Tyr Leu
370 375 380

Ala Thr Gly Gly Phe Leu Ile Ala Tyr Gln Pro Leu Leu Ser Asn Thr
385 390 395 400

Leu Ala Glu Leu Tyr Val Arg Glu Tyr Met Arg Glu Gln Asp Arg Lys
405 410 415

Pro Arg Asn Ala Thr Pro Ala Pro Leu Arg Glu Ala Pro Ser Ala Asn
420 425 430

Ala Ser Val Glu Arg Ile Lys Thr Thr Ser Ser Ile Glu Phe Ala Arg
435 440 445

Leu Gln Phe Thr Tyr Asn His Ile Gln Arg His Val Asn Asp Met Leu
450 455 460

Gly Arg Ile Ala Val Ala Trp Cys Glu Leu Gln Asn His Glu Leu Thr
465 470 475 480

Leu Trp Asn Glu Ala Arg Lys Leu Asn Pro Asn Ala Ile Ala Ser Ala
485 490 495

Thr Val Gly Arg Arg Val Ser Ala Arg Met Leu Gly Asp Val Met Ala
500 505 510

Val Ser Thr Cys Val Pro Val Ala Pro Asp Asn Val Ile Val Gln Asn
515 520 525

Ser Met Arg Val Ser Ser Arg Pro Gly Thr Cys Arg Pro Leu Val Ser
530 535 540

Phe Arg Tyr Glu Asp Gln Gly Pro Leu Ile Glu Gly Gln Leu Gly Glu
545 550 555 560

Asn Asn Glu Leu Arg Leu Thr Arg Asp Ala Leu Glu Pro Cys Thr Val
565 570 575

Gly His Arg Arg Tyr Phe Ile Phe Gly Gly Gly Tyr Val Tyr Phe Glu
580 585 590

Glu Tyr Ala Tyr Ser His Gln Leu Ser Arg Ala Asp Val Thr Thr Val
595 600 605

Ser Thr Phe Ile Asp Leu Asn Ile Thr Met Leu Glu Asp His Glu Phe
610 615 620

Val Pro Leu Glu Val Tyr Thr Arg His Glu Ile Lys Asp Ser Gly Leu
625 630 635 640

Leu Asp Tyr Thr Glu Val Gln Arg Arg Asn Gln Leu His Asp Leu Arg
645 650 655

Phe Ala Asp Ile Asp Thr Val Ile Arg Ala Asp Ala Asn Ala Ala Met
660 665 670

Phe Ala Gly Leu Cys Ala Phe Phe Glu Gly Met Gly Asp Leu Gly Arg
675 680 685

Ala Val Gly Lys Val Val Met Gly Val Val Gly Gly Val Val Ser Ala
690 695 700

Val Ser Gly Val Ser Ser Phe Met Ser Asn Pro Phe Gly Ala Val Gly
705 710 715 720

Leu Leu Val Leu Ala Gly Leu Val Ala Ala Phe Phe Ala Phe Arg Tyr
725 730 735

Val Leu Gln Leu Gln Arg Asn Pro Met Lys Ala Leu Tyr Pro Leu Thr
740 745 750

Thr Lys Glu Leu Lys Thr Ser Asp Pro Gly Gly Val Gly Gly Glu Gly
755 760 765

Glu Glu Gly Ala Glu Gly Gly Gly Phe Asp Glu Ala Lys Leu Ala Glu
770 775 780

Ala Arg Glu Met Ile Arg Tyr Met Ala Leu Val Ser Ala Met Glu Arg
785 790 795 800

Thr Glu His Lys Ala Arg Lys Lys Gly Thr Ser Ala Leu Leu Ser Ser
805 810 815

Lys Val Thr Asn Met Val Leu Arg Lys Arg Asn Lys Ala Arg Tyr Ser
820 825 830

Pro Leu His Asn Glu Asp Glu Ala Gly Asp Glu Asp Glu Leu
835 840 845

(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20388 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 



GGATCTCCTC GTTCTCTTGC GTGATGGACA CGTCCTCCGC GGTGGCCGTG TCGCCTCCCG 60
GGGCCGTGAG CTGCTCCTCC GGGGAGATGG GGGGGTCTGG GGTGCCGACA ACGGCCGGCC 120
CGGCCCCGCC CGAGACCGAG GACGCCTGGG GAGTGGGGGT GCCGCTTTCC CCCATCCCCA 180
GGGACAGGTG GGCCGCCGCC TCCGTCGCGG CGGCGGGAGC CGCGGCCCCC AGCCGCGCGA 240
CGTAGCGACA AAAGTGGCGA CAGAGGCGCA TGAGGCGCGC GCCGTCGGCC GCGTATCGCG 300
TGTTTGGCGG GACGAGCTCG TCGTAACTGA ACAGGAGCAC GCGGGCGCAG GTCGCCCACG 360
GGCCCCACGC CAGGCGCAGC GCCGCGACCG TGTACGGGTC GTACACGCCT TGGGCGTCGC 420
ACGCGACCGG CAGGGAGACG AACAGCCCGC CCGCGCTGGG GACGCGCGGC AGGAGGTCCG 480
GGTGCGCCGG GATGACGGGG GCTAGGATCG CCCCCACCGC ATCCGCCGGC ACGTAGGCGG 540
CAAACGCCGA ACGCCACGGG GTGCAGTCGC CGGTCGCGTG GGCCCGGGTC TGGGTTTCGA 600
CCCGGAAGTT CGCGGCCGCC CCGCCGTCGG GGCGGCCGCG CACGAGGGCG GACAGCGGGA 660
CCCCCGCCGC CGCCAGGCAC TCGCTGGAGA TGATGACGTG AATCAGCGAG GCGGGGCTGC 720
TCGGGTCCCG GGTGAGATCG TATTGGACCT CGTTGGCAAA GTGCGCGTTC ATGGCCCGGC 780
CGGCGGTGCG AGCCCTTCCC GGTGCCGGAA GGGGCGTGGG TGGGGGGTGC GTGTGCGCGT 840
CCTCGGGGCC CGCGGGCGCA CGTGCGCTTA TACGCTGTGT GTTTCGTCTG TCCCCAGGGA 900
ATCCGGGGCC AGGACTTTAA CCTGCTTTTC GTCGACGAGG CCAACTTTAT TCGCCCGGAT 960
GCGGTCCAGA CGATTATGGG CTTTCTCAAT CAGGCCAACT GCAAGATCAT CTTCGTCTCG 1020
TCGACCAACA CCGGGAAGGC CAGCACGAGC TTTTTGTACA ACCTCCGCGG GGCCGCCGAC 1080
GAGCTGCTCA ACGTGGTCAC CTATATATGC GACGACCACA TGCCGCGGGT GGTGACGCAC 1140
ACCAACGCCA CGGCCTGTTC CTGCTATATC CTGAACAAAC CCGTGTTTAT CACGATGGAC 1200
GGCGCCGTTC GCCGGACGGC CGATCTGTTT CTGCCCGACT CCTTCATGCA GGAGATCATC 1260
GGGGGGCAGG CCCGCGAGAC CGGCGACGAC CGGCCCGTCC TAACAAAGTC GGCGGGGGAG 1320
CGGTTTCTGC TGTACCGCCC CTCCACCACC ACCAACAGCG GCCTGATGGC CCCCGAGCTG 1380
TACGTGTACG TGGACCCGGC GTTCACGGCC AACACGCGCG CCTCCGGCAC CGGCATCGCG 1440
GTCGTCGGGA GGTACCGCGA CGATTTCATT ATCTTCGCCC TGGAGCACTT TTTCCTCCGC 1500
GCGCTCACGG GATCGGCCCC CGCGGACATC GCCCGCTGCG TCGTGCACAG CCTCGCCCAG 1560
GTGCTGGCGC TGCACCCCGG GGCGTTTCGC AGCGTTCGCG TGGCGGTCGA GGGCAACAGC 1620
AGCCAGGACT CGGCCGTGGC CATCGCCACA CACGTGCATA CCGAGATGCA CCGCATCCTG 1680
GCCTCGGCGG GGGCCAACGG CCCGGGGCCC GAGCTCCTCT TCTATCACTG CGAGCCGCCC 1740
GGCGGCGCGG TATTGTACCC CTTCTTTCTG CTCAACAAAC AGAAGACGCC CGCCTTCGAA 1800
TACTTTATCA AAAAGTTCAA CTCCGGGGGC GTCATGGCGT CCCAGGAGCT CGTCTCCGTG 1860
ACGGTGCGCC TGCAGACCGA CCCGGTCGAG TATCTGTCCG AGCAGCTCAA CAACCTCATC 1920
GAAACCGTCT CTCCCAACAC CGACGTCCGC ATGTACTCCG GAAAACGCAA CGGTGCCGCG 1980
GACGACCTCA TGGTCGCGGT CATCATGGCC ATTTACCTGG CGGCCCCGAC CGGGATCCCC 2040
CCGGCCTTTT TTCCGATCAC GCGCACGTCT TGAGTCTTTC TTGCCGTTTC TTTTGTTTCT 2100
CTTTCTTTCC CCCCTCTCTC CGCAATAAAC GCCTTCCCGG AACTGTGTTT TCCCCCCCTA 2160
CAACAGTGTT GTCCGTTGGT TGGGTGGTTG GGGTGCGGGG GTGGGCGGGG GAAGCAAGAA 2220
AACGGTCGGC GAACACAACA TCGGGAAAAC GGATTCCCGA ACGTGCGTCT TCCCAGATTC 2280
GACACACACC CCCCTTCTCC TTAAATAAAC ACAAACCACA CGCTCGTTGG TTGGTTAATG 2340
CCGGCGCTTT ATTTACGTCT TGTTTTTTTG CGTTTCCTCC GCGGGTCCCT TCCCAACACG 2400
CCTGCCCCCG CCTCAGGGGT AGCGGATAAC CGGGGCCATG TCGCCGGATT GCACAACGGC 2460
GGCGCCGTCG AACGTACACA CCCGAACCGC CGGGGCCAGG GCCAGGATGT CCCCGAGTTG 2520
GCCCGCGTGC GCCAGCCAGG CGACCAGCGC CTCGTAAAGC GGCAGCCTGC GCTCGCCGTC 2580
CTGCATCAGC ATGGGGGCTT CGGGGTGGAT GAGCTGGGCG GCTTCTCGCG TGACGCTCTG 2640
CATCTGCAGG AGCGCGTTCA CGTATCCGTC CTGGGCGCTC AGCGCGAGCA GCCGGGGGAT 2700
GAGCGTGAGG ATGAGGGTGG TTCCTTCGGT TATGGAGTAG ACCATGTTGA GGACGAGCGA 2760
CCGCAGCTCG GTGTTTACGG AGGCGAGTTG CTGGACGTCG GCCACGAGCG AGAGACGGGC 2820
CCCGTTGTAA TACAGCACGT TGAGGTCGGG GAGCTCCCCG GGCGTCCGGG GGTCGGGGTT 2880
GAGGTCCCGG ATGCCCCGGG CGACCAGCCG CGCGACTATC TCGCGGGCCA GGGGCGTTGG 2940
GAGCGGGACC GGAAACCGCA GCGTGAGGTC CAGCGACTCC AGGCGCACGT CCGTCGCCTG 3000


GCCCTCGAAG ACGGGCGGGA CGAGGCTGAC GGGATCCCCG TTGCAGAGGT CGACGGGGGA 3060
GGTGTTGCGG AGATTGACGG TGCCGGCGTG CGTGAGCCCC AGGTCCACGG GGCAGGCGAC 3120
GATTCGCGTG GGCAGCACCC GCGTGATTAC CGCGGGGAAG CGCCTGCGGT ACGCCAGCAA 3180
CAACCCCAAC GTGTCGGGAC TAACTCCTCC GGAGACGAAC GATTCGTGCG CCACGTCCGC 3240
GAGCGCCAGC TGGCGGCGGA TGGTCGGCAG AAAGACCACT CGACCCTCGC ACCGCTGCAG 3300
CGCCGCGGCA TCGGGGCGCG AGATACCCGA GGGGATCGCG ATGTCTGCTT CGAAACAATC 3360
CGTGATCATG GCGCCGGGCC GCGAGACACC GGAACGCGGG GGTGCGGGAG GGCCGGAAAG 3420
CGCAACGCAA CCGGGACGAT GATGAAACAG AGATGGGGGG CACCGACCGT GTGGGAGAGG 3480
GGGCGGGGCA GGGCTCAGCA GCACGCACGG GGAGGTCTGT CGTGCGCAGG AGCCCCAGGT 3540
GAGAATCAGT CCCCCGGAGC TCGGGTCTGG GTTTTATTGG GACCTGCCCT CGGAATCGCG 3600
GCTCCCAGTC CAAGCCCCCC CGGGGGGGCG GGGACAGGGG GTGTGTGTGG GTAAAAGCAA 3660
CGTCGGAAAA TCAAACCCAA TGCCCCAAAC AGGAAAAAAA AAAAAGACGG GCGGGTGGAG 3720
GGAAAGCTGG GGAAGAAGAA GCCAATTTTA CAGAGACAGG CCCTTTAGCG GGGAGGCGTC 3780
GTAGATGAGA TACTGCGTAA AGTGGGTCTC TCGCGCGTGG GCCTCCCCAT CGCGGGCGCT 3840
GCGTAGCAGG GCGGGGTCGC TGGCGCAGGT GATCGGGTAG GCTTCCTGAA ACAGGCCGCA 3900
CGGGTCTTCC ACGAGCTCGC GGCACCCCGG CGGGCGCTTA AACTGCACGT CGCTGGCAGC 3960
GGTGGCCGTG GATACCGCCG ATCCCGTTTC CACGATAAGA CGCTCCAGGC AGCGATGTTT 4020
GGCCGTGATG TCGGCCGCGG TGAAGAACTT GAAGCAGGGG CTGAGGACGG GCGAGGCCCC 4080
GTTGAGGTGA TAGGCCCCGT TGTACAGCAG GTCCCCGTAC GAGAACCGCT GCGACGCCCA 4140
CGGGTTGGCC GTGGCCGCGA AGGGCCGCGC CGGGTCGCTC TGGCCGTGGT CGTACATGAG 4200
GGCTATGACG TCCCCCTCCT TGTCCCCCGC GTACACGCCG CCGGCCGCGC GTCCCCGCGG 4260
GTTGCAGGGC CGGCGAAAGT AGTTGATGTC CGTGGCCACG GGGGTGGCGA TGAACTCACA 4320
CACGGCATCC TGCCCGTGGT CCATGCCGGC GCGCCGCGGC ACCTGGGCGC AGCCAAAGAC 4380
CGGGAGGGGC TGGGCCGGCC CCAGCCGGTT TCCCGCCACG ACCGCGTTGC GCAGGTACAC 4440
GGCGGCCGCG TTGTCTAGCA GCGGGGGGGC CCCGCGGCCG AGGTAAAAGT TTTGGGGGAG 4500
GTTGCCCATG TCCGTAACGG GGTTGCGGAC GGTGGCCGTG GCCGCGACGG CGGTGTAGCC 4560
CACACCCAGG TCCACGTTTC CGCGCGGCTG GGTGAGCGTG AAGCTGACCC CCCCGCCCGT 4620
TTCGTGGCGG GCCACCTGGA GCTGGCCCAG AAAGTACGCC TCCGACGCGC GCTCGGAAAA 4680
CAGCACGTTC TCGGTCACGA AGCGGTCCTG CCGCACGACG GTGAACCCGA ACCCGGGGTG 4740
GAGGCCCGTC TTGAGCTGGT GATACAGGGC CACGGGGCTC ATCTTGAAGT ACCCCGCCAT 4800
GAGCGCGTAG GTCAGCGCGT TCTCCCCCGC CGCGCTCTCG CGGGCGTGCT GCACCACGGG 4860
CTGGCGGATG GAGGAGAAGT AGTTGGCCCC CAGGGCCGGG GGGACCAGGG GGACGTGGCG 4920
CGCCAGGTCG CGCAGGGCCG GGGGGAAGTT GGGCGCGTTG GCCACGTGGT CGGCGCCCGC 4980
AAACAGCGCG TGGACGGGCA GGACGTAGAA GTATTCGCCA TTTTGGATGG TGTGGTCCAG 5040
GTGCTGGGGG GCCATGAGCA GCACGCCGGC GTGCAGCGCC CCGTCGAAGA TGCGCATGTT 5100
GGCCGTCGAC GCGGTGTTGG CGCCCGCGTC GGGCGCCGCG GAGCACAGCA GCGCCGTCGT 5160
GCGCTCGGCC ATGTTGTGCG CCAGCACCTG CAGCGTGAGC ATGGCGGGCC CGTCGACGAC 5220
GACGCGCCCG TTGTGGAACA TGGCGTTGAC CGTGTTGGCC ACCAGATTGG CGGGATGCAG 5280
CGGGTGGGCG GGGTCGGTCA CGGGATCGCT CGGGCACTCC TCGCCGGGGG CGATCTCCGG 5340
GACCACCATG TTCTGCAGCG TGGCGTACAC GCGGTCGAAG CGGACCCCCG CGGTGCAGCA 5400
GCGCCCCCGC GAGAAGGCCG GCACCAGCAC GTAATAGTAG ATTTTGTGGT GGACGGTCCA 5460
GTCGGCCGGC CGGTGCGGCC GGTCGTCGGC GGCGTCGGCC GCGCGGGCCT GGGTGTTGTG 5520
CAGCAGCCGG CCGTCGTTGC GGTTAAAGTC GGCCGTCGCC ACGTTGCACG CCGCCGCGTA 5580


GACGGGCTCG TGTCCCCCCG CGTCAATCCG GCAGTCTCGG TGGCGGTCCA GGGCCGCGTG 5640
TCGCATAAGG CCGTCGCAGT CCCACACGAG GGGCGGCAGC AGCGCCGGGT CGCGCATCAG 5700
GTGATTCAGC TCGGCCTGAG CCTGCCCGCC CAGCTCCGGG CCCGGCAGGG TAAAGTCGTC 5760
CACCAGCTGG GCCAGGGCCT CGACGTGGGC CACCAGGTCC CGATACACGG CCATGCACTC 5820
CTCGGGGAGG TCGCCCCCGA GGTAGGTCAC GATGTACGAG ACCAGCGAGT AGTCGTTCAC 5880
GAACGCCGCG CATCGCGTGT TGTTCCAGTA GCTGGTGATG CACTGAGTCA CGAGCCGCGC 5940
CAGGGCGCAG AACACGTGCT CGCTGCCGTG AATCGCGGCT TGCAGCAGGT AAAACACCGC 6000
CGGGTAGCTG CGGTCCTCGA ACGCCCCGCG GACGGCGGCT ATGGTAGCCG GCGCCATGGC 6060
GTGGCGGCCA ACGCCGAGCT CCAGGCCCCG GGCGTCACGA AACGCCACCG GACACAGCGC 6120
CAGGGGCAGG TTGCCGTTGA CCACGCGCCA GGTGGCCTGG ATCGCCCCCG GACCGGCCGG 6180
GGGGACTTCG CCGCCGGGAA GCTCGACGTC GGCCACGCCC GCGAAGAAGT CGAACGCGGG 6240
GTGCAGCTCC AGAGCCAGGT TGGCGTTGTC GGGCTGCATG AACTGCTCCG CGGTCATCTG 6300
GCACTCGGCG ACCCACCGGA CCCGGCCGTG GGCGAGGCGC TGCCGCCAGG CGTTCAGAAA 6360
ACGCTGCTGC ATGTCCGCGC CGGGGCCGGC CGGGGCCGCG ACGTACGCCC CGTACGGATT 6420
CGCGGCCTCG ACGGGGTCGT GGTTCACGCC CCCGACGGCC GCGTCGATGT TCATGAGCGA 6480
AGGATGACAC ACGGTCCCGA CCGCGTTCTC CATGGACAGC CGCAGAACCT GGTGGTCCTT 6540
TCCCCAAAAA AACAGCTGCC GGGGAGGGAA CGCGCGGGGC TCCGGGTGGC CGGGGGCGGG 6600
CACCAGGTCC CCGGCGTGCG CGGCGAAGCG CTCCATGGCC GGGTTGAACA GCCCCAGGGG 6660
CAGGACGAAC GTCAGGTCCA TGGCGCCCAC CAGGGGGTAG GGCACGTTGG TGGCGGCGTA 6720
GATGCGCTTC TCCAGGGCCT CCAGGAAGAC CAGCCTGTCG CCTATGGCCA CCAGATCCGC 6780
GCGCACGCGC GTTGTCTGGG GGGCGCTTTC GAGTTCATCC AGCGTCTCCC GGTTCGCCTC 6840
GAGTTGCTCC TCCTGCATCT CCAGCAGGTG GCGGCCCACG TCGTCCAGGC TCCGCACGGC 6900
CTTGCCCATC ACCAGCGCCG TGACGAGGTT GGCCCCGTTC AAGACCATCT CGCCGTAGGT 6960
CACCGGCACG TCGGCCTCGG TGTCCTCCAC CTTCAGGAAG GACTGCAGGA GGCGCTGTTT 7020
GATGGCGGCG GTGGTGACCA GCACCCCGTC GACCGGCCGC CCGCGCGTGT CGGCGTGCGT 7080
CAGGCGGGGC ACGGCCACGG ACGGCTGCGT CGCCGTGGTC AGGTCCACGA GCCAGGCCTC 7140
GATGGCCTCG CGGCGATGGC CCGCCTTGCC CAGGAAGAAG CTCGTGTCGC AAAAGCTCCG 7200
CTTCAGCTCG GCGACCAGGG TCGCCCGGGC GACCCTGGTC GCCAGGCGCC CGTTGTCGAG 7260
ATATCGTTGC ATGGGCAACA GCAGGGCCAG GGGAGGCGCC TTCTCCAACA GCACGTGCAG 7320
CATCTGGTCG GCCGTGCCGC GCTCAAACGC CCCCAGGACG GCCTGGACGT TGCGCGCGAG 7380
CTGCTGGATG GCGCGCAGCT GGCGATGCAG GCTAATGCCC GTCCCGTCCA GGGCCTCCCC 7440
CGTGAGCAGG GCAATGGCCT CGGTGGCCAG GCTGAAGGCG GCGTTCAGGG CCCGGCGGTC 7500
GATGACCTTC GTCATGTAAT TATGCACGGG CTGCTCGACG GGGTGCGGGC CGTCGCGGGC 7560
GATGAGGGGC TGGTGGACCT CGAACTGCAC ACGCCCTTCG TTCATGTAAG CCAGCTCCGG 7620
GAACTTGGTG CACACGCACG CCACGGACAG GCCGAGCTCC AGAAAGCGCA CGAGCGACAG 7680
GGTGTTGCAG TAGGACCCCA GCAGGGCGTC AAACTCTACG TCATACAGGC TGTTTTCGTC 7740
GGAGCGCACG CGGGCGAAAA AATCAAAGAG TCTGCGGTGG GACGCCACCT CGATCGTACT 7800
CAGGATGGAG CCGGTGGGCA CCATGGCCGC GGCGTACCGG TAACCCGGGG GGTCGCGGGC 7860
AGGAGCGGCC [missing] TTGGGGGATT CGCAGGCTCC ATCAAGCCAA GCTCGGGAAG 7920
GCCAAGCCCC TCCCACACAA CGCCTCACCG CCGGCGGACG CGACTAACAA CCCACGGGCC 7980
GCCAAAAACC CCAAGGGGCA ACCCGACCAA CAACAGGCGA GGGGAGGAAA GGCGTAAAGG 8040
GGGCGTTGGG AGGCAAAAAG AAAGAAAACA CCCAGACGTA GGCCCGAGGA CCGGCCGGCG 8100
TCCTCTGTCC CCGAGCACCC ACTGTGCCCA ACAGGCACGG GGGCGAGCTG CCCCTGCCTT 8160


ATATACCCCC CCGCCACACC CCCGTTAGAA CGCGACGGGT GCCTTCAAGA TGGCCCTGGT 8220
CCAAAAGCGT GCTAGAAAAA AGTTGGTAAA GGCGGCAAAG CAGTCCGCCG CCGCCACCCA 8280
CATGGCGGCG CCGGCCGCGC AGGCGATTCC CAGAGAACGG GCGCGGAGGG GATCCGTGCG 8340
GGGCAGCAGC TGGCTGGCGG TGATCCAATG GAAAAGCCCG TCGGGACTGA ACGTCTCATG 8400
GGCGGCCGCC ACCAGGGCGC ACAGGGCCGC GCCGCCCATG ATCACGCACA ACCCCCAAAA 8460
CACGGGTGGC GACAACGGCA GGCGATCCCG TTTGATGTTC ACGTACAGGA GGAGCGCCCG 8520
TGCCAGCCAC GTGACATAGT AGGCGAGGAC GGCGGCTATA ATACATGCCG GCGCCACCGC 8580
CCGTCCGGTC CACCCGTAAT ACATGCCCGC GGCCAGCAGC TCCAGCGGCT TGAGGACCAG 8640
GAACGACCAA GCAAACATCA CCACCCGCTT GGAAAAGACC GGCTGGGTGT GGGGCGGAAG 8700
ACGCGAGTAG GCCGAACTGA CAAAAAAATC AGACGTGCCG TACGAGGACA GCGAAAACTG 8760
TTCATCGAGC GGCAGTTCGC CGTCCTCCCC GCCACACGCG GCCTCGTATA CCAGCTCGCG 8820
ATCCAACAAA GGAACATCAT CCCGCATTGT CATGGTCGGT GCGGGGAGCC GGCGAGGCAG 8880
CGAAACCGAA AGTAGTGCTG GCGGCGCGGG CCCGGGTCCG GACCCAAGCT TCAGGGATGG 8940
GGGGCGGAGG CCAAAATCAA ACAAGCACCG CGCGGGTTCT ACACACAACC CCCACCCGGG 9000
TAGTATCCGC GGATGCGAGT GCCTGGCGAA GTCACGTCCC AGCAGGATAT AAACCTCGGC 9060
CGTTGGGCCC GGAACCCCCG AAATTCACAC CCACGCCCTG ACGCCCAAAT CATGGGTGGA 9120
TGTGGTTCGC GAGCCGCACA TCCGTGCGTC CGCCCTCCCC CGCGGGCTGA TGACGTGGCG 9180
GTTAGTCAGT GGGAAGGCAG GGGGAAAGAT GGGTTGGGGG AGGAAACGAA GAAAACACCC 9240
AGAGGGCCAC GTCGGGAATG CGCCCGGAGT TGTCCTTAAA AGGCCGGCCG TGCGTGACGG 9300
AAGCCGTCGT TTGCCCAAGC ACCGACGCCG CGATCCACAG TGGGGGGAGT TCCTCCGTCC 9360
GGCCACAACC CTACGCGCGG GCGGCACGCG CGAGAGCAAC CCACGGGTCC CGTTCGCGCC 9420
ACCGCCAGCC CTTGCTCCCA CCACCCTCCT CCCACCACCC CACTATTCCC CCCCCCCCAA 9480
GTCCGCCCCG TGGCTCGCCG GCCATGGAGC TCAGCTATGC CACCACCCTG CACCACCGGG 9540
ACGTTGTGTT TTACGTCACG GCAGACAGAA ACCGCGCCTA CTTTGTGTGC GGGGGGTCCG 9600
TTTATTCCGT AGGGCGGCCT CGGGATTCTC AGCCGGGGGA AATTGCCAAG TTTGGCCTGG 9660
TGGTCCGGGG GACAGGCCCC AAAGACCGCA TGGTCGCCAA CTACGTACGA AGCGAGCTCC 9720
GCCAGCGCGG CCTGCGGGAC GTGCGGCCCG TGGGGGAGGA CGAGGTGTTC CTGGACAGCG 9780
TGTGTCTGCT AAACCCGAAC GTGAGCTCCG AGCGAGACGT GATTAATACC AACGACGTTG 9840
AAGTGCTGGA CGAATGCCTG GCCGAATACT GCACCTCGCT GCGAACCAGC CCGGGGGTGC 9900
TGGTGACCGG GGTGCGCGTG CGCGCGCGAG ACAGGGTCAT CGAGCTATTT GAGCACCCGG 9960
CGATCGTCAA CATTTCCTCG CGCTTCGCGT ACACCCCCTC CCCCTACGTA TTCGCCCTGG 10020
CCCAGGCGCA CCTCCCCCGG CTCCCGAGCT CGCTGGAGCC CCTGGTGAGC GGCGTGTTTG 10080
ACGGCATTCC CGCCCCGCGC CAGCCCCTGG ACGCCCGCGA CCGGCGCACG GATGTCGTGA 10140
TCACGGGCAC CCGCGCCCCC AGACCGATGG CCGGGACCGG GGCCGGGGGC GCGGGGGCCA 10200
AGCGGGCCAC CGTCAGCGAG TTCGTGCAAG TGAAGCACAT CGACCGTGTT GTGTCCCCGA 10260
GCGTCTCTTC CGCCCCCCCG CCGAGCGCCC CCGACGCGAG TCTGCCGCCC CCGGGGCTCC 10320
AGGAGGCCGC CCCGCCGGGC CCCCCGCTCA GGGAGCTGTG GTGGGTGTTC TACGCGGGCG 10380
ACCGGGCGCT GGAGGAGCCC CACGCCGAGT CGGGATTGAC GCGCGAGGAG GTCCGCGCCG 10440
TGCATGGGTT CCGGGAGCAG GCGTGGAAGC TGTTTGGGTC GGTGGGGGCT CCGCGGGCGT 10500
TTCTCGGGGC CGCGCTGGCC CTGAGCCCGA CCCAAAAGCT CGCCGTCTAC TACTATCTCA 10560
TCCACCGGGA GCGGCGCATG TCCCCCTTCC CCGCGCTCGT GCGGCTCGTC GGTCGGTACA 10620
TCCAGGGCCA CGGCCTGTAC GTTCCCGCGC CCGACGAACC GACGTTGGCC GATGCCATGA 10680



ACGGGCTGTT CCGCGACGCG CTGGCGGCCG GGACCGTGGC CGAGCAGCTC CTCATGTTCG 10740
ACCTCCTCCC GCCCAAGGAC GTGCCGGTGG GGAGCGACGC GCGGGCCGAC AGCGCCGCCC 10800
TGCTGCGCTT TGTGGACTCG CAACGCCTGA CCCCGGGGGG GTCCGTCTCG CCCGAGCACG 10860
TCATGTACCT CGGCGCGTTC CTGGGCGTGT TGTACGCCGG CCACGGACGC CTGGCCGCGG 10920
CCACGCATAC CGCGCGCCTG ACGGGCGTGA CGTCCCTGGT CCTGACCGTG GGGGACGTCG 10980
ACCGGATGTC CGCGTTTGAC CGCGGGCCGG CGGGGGCGGC TGGCCGCACG CGAACCGCCG 11040
GGTACCTGGA CGCGCTGCTT ACCGTTTGCC TGGCTCGCGC CCAGCACGGC CAGTCTGTGT 11100
GAGATATCCC AATAAAGTGC AGTCGTTTTC TAACCCACGG ATGCCGTTGT ATGCCTATAC 11160
GGGGGACTAT GGGGGGGGGG GGAAAGGAAA GGAAACAGGA ATGGAGAAGG GAAAGGAACA 11220
GAGGCGGTAG CGGACGCACG GCGGACACAA TAACAAACAG ACCGCGGACA CGGAGGGAGT 11280
CGGTTGGGTT GGGCGTGGAC GCCGCTGCGT CCACACACCC GTTTATTCGC GTCTCCACAA 11340
AAATGGGACG CACGTTCGGA CCACCCTGAG GATGCCCGCC AGGGCCGCGG TGATCATAAC 11400
GACCCCCAGC GCGGACGCGG CCAGAAACCC GGGGGCGATG GTGGCGATGG GCAGCGTGTC 11460
AAAGGCCAGC AGATGAATCA CAGTTCCGTT GGGGAACAAC AACAGGGCCA CGGACGGCAC 11520
GTCGCTGGAA AACACGTTCG GGGTGCCCGC CACCGGCCCC TGGGCCAGCT GCTGTTGGGT 11580
GGCATCCGTG TCCACCAGCA GCACCGACAT GACCTCCCCG GCCGGGGTGT AGCGCAGAAA 11640
CACGGCCCCC ACGAGGCCGA GGTCGCGCCG GTTTTCGGTG CGCACCAGCC GCTTCGGCTC 11700
AATCTCCCGC GCGTGCCCTT CGCAGGTGGC GGTGAGATAG GTGATAAACA GCGGGCGGCG 11760
GACGTCAACG CCCGTAAGCT TGTATCCGAT CCCGCGGGGC AAGGGGGTGT GGGTGACGAC 11820
GTAGCTGGCG TTGTGGGTGA TGGGCACGAG GATCCGGGGC TCCGCGTTGT GCGACGGGCC 11880
GCTACACTGG TGGGTGGCCT CCGGGACGAA GGCGCGGATC AGGGCGTTGT AGTGCGCCCA 11940
GCGCGTGAGA ACGGAGGCCA CGCCGCGGGT CTGTTGTGCC ATGACGTCCG CCGGGATGTC 12000
GGATCGGGTG GCCATGGCCA GCGCGTCCAG GATGAACCCG CCCTCGGCGA GATCGAAGCG 12060
CAGGGAAGCT GCGCATGGGG AAAAGTGGTC CGGGAGCCAG AAGAGGTTTT TCTGGTGGTC 12120
GGTCCTGGCT AGCGCGGCCC GGAGATCGGC GTGGGTCGCC GCGGCGACGT CGGACGTACA 12180
CAGGGCCGTG GTTATGAGGA GGCCCCGGCG GGCGCGTTCC CGCTGCTCGG CCGAGGGCGC 12240
GCCCGCCAGG AACGGCGCCC GGAGGACGGC CGTGGCGTAA AACAGCGCTC GGCGGACCAT 12300
CGGGGCGGTT AGCGCGCGGC CGCCGAGAAA CTCGGCGTAC AGGGCGTCGA TCAGGCGGGC 12360
CGCGCTCGGG GCCACCGCGC CATAGGCCGC GGGGCTGTCC AACACGAACG CCAGCTGATA 12420
GCCCAGCGCG TGCGCCGCCA GGCTCTGCTC TCGCTCGAGG ATCGCGGCCA CCAGATGCCC 12480
GAGGCGCGCC TCCAGCCGCA GGCGGGCCGC CGGGTCCAAC ACGGACACGT TCAGGAACAC 12540
CGAGTCGGCC GCGCAGCCCG CTGCTCCCCG GGCGGCCAGG CCGGCCAGCA CGCGCGAGTG 12600
GGCCAAAAAG CCCAGCAGGT CGGAGAGGCG AATCGCGTCG TGGGCGTGGG CCGCGTTGAC 12660
GAACGCAAAC CCCGACGAGG CGAGCAGCCC CGCGAGGCGC CAGAACAGGG ACGGACGCGC 12720
GTCCGTGCCG GAGCCCGGGT CCTCCCCCAA AAACTCCGCA TAGGCCCGCG ACATATACTG 12780
GGCGTAGTTC GTGCTCTCCT CGGGGTAGCC GGCCACCCGC CGGAGGGCGT CCAGCGCCGA 12840
GCCGTTGTCG GCGGGCGTCG GGGCCCCCAG GACAAAGACG CGATACCTGG GGCCGGCCGG 12900
AGGCCCGGGG AGCACCGCGG GGGCGTTTTC GTCGGTCGGA TTTCCGACCC GAGCGAGGGT 12960
CTTGTCCGCA GGCACCACTA TGATCTCGGC CGGAGGGCTG TCCCGCATCG ATATCACGAG 13020
CCCCATGAAG CCCTTCCCGT ATCGCGCGCG CACGAGCGCG GCGTCGCACC CGAACGCCAG 13080
CCCGCCCGTC GTCCAGACGC CCACGGGCCA CGTCGAGGCC GACGGGGAGA GGTACACGTA 13140
CCGACCCGGA GTCCGTAGCA GGCCCCTGGC GGCCAGCCAG GTCACGGATG CGTTGTGCAG 13200
ATGCGCGATG CTCAGGTTCG TCGTCGGATG CCTCGGTGTC CCCGCGGGCG GCCCCGGGGG 13260
CGGCGCGTTG CGTCGGCCGT CCGGGTGCCT CTCGGTCGCC CCGTCGTCTC CCCGCGGGAA 13320
CGTAAGCCCC TCGCGGTCCG GCGCGGCCGC GAATGTTACC CAGGCCCGGG ACCGCAACAG 13380
CGCGGAGGCG CCGGGGTTGT GCGACAGTCC CTTGAGCTGG GTCACCTCGG CGGGGGGACG 13440
GGACGTGGGC CCCGCCTCGG GGAGCTCGGG CAGGCTCGCG TTCCGAGGCC GGCCGAGCAG 13500
ATAGGTCTTT GGGATGTAAA GCAGCTGCCC GGGGTCCCGA GGAAACTCGG CCGTGGTGAC 13560
CAACACGAAA CAAAAGCGCT CGGCGTACCA CCGAAGCATG GGCACGGATG CCGTAGTCAG 13620
GTTGAGTTCG CCCGGGGGCG CCAAGCGTCC GCGCTGGGGG TCGCTGGCGT CGGGGGTGTT 13680
GGGCAACCAC AGACGCCCGG TGTTTGTGTC GCGCCAGTAC GTGCGGGCCA ACCCCAGACC 13740
GTGCAAAAAC CACGGGTCGA TTTGCTCCGT CCAGTACGTG TCATGGCCCC CGGCAACGCC 13800
CACCAGGACC CCCATCACCA CCCACAGACC GGGGCCCATG GTCGTCCGTC CCGGCTGCCA 13860
GTCCGCAGAT GGGGGGGTGT CCGTACCCAC GGCCCAAAGA GGCTCCGCAC CTCGGAGGCT 13920
ATCGGAGGCC CTTTGTTGCC GTAAGCGCGG GCCAAAGGAT GGGGTGGGGT GAGGGTAAAA 13980
GCACAAAGGG AGTACCAGAC CGAAAACAAG GACGGATCGG CCCGCTCCGT TTTTCGGTGG 14040
GGTGCTGATA CGGTGCCAGC CCTGGCCCCG AACCCCCGCG CTTATGGACA CACCACACGA 14100
CAACAATGCC TTTTATTCTG TTCTTTTATT GCCGTCATCG CCGGGAGGCC TTCCGTTCGG 14160
GCTTCCGTGT TTGAACTAAA CTCCCCCCAC CTCGCGGGCA AACGTGCGCG CCAGGTCGCG 14220
TATCTCGGCG ATGGACCCGG CGGTTGTGAC GCGGGTTGGG ATCATCCCGG CGGTGAGGCG 14280
CAACAGGGCG TCTCGACACC CGACGGGCGA CTGATCGTAA TCCAGGACAA ATAGATGCAT 14340
CGGAAGGAGG CGGTCGGCCA AGACGTCCAA GACCCAGGCA AAAATGTGGT ACAAGTCCCC 14400
GTTGGGGGCC AGCAGCTCGG GAACGCGGAA CAGGGCAAAC AGCGTGTCCT CGATGCGGGG 14460
CAGAGACCCC GCGCCGTCCT CGGGGTCGGG GCGCGGGGTC GCCGCGGCGA CCCCCGTCAG 14520
CCGGCCCCAG TCCTCCCGCC ACCTCCCGCC GCGCTGCAGG TACCGCACCG TGTTGGCGAG 14580
TAGATCGTAG ACACGGCGAA TGGCGGACAG CATGGCCAGG TCAAGCCGCT CGCCCGGGCG 14640
TTGGCGTCTG GCCAGGCGGT CGGCGTGTTC GGCCTCCGGA AGGACACCCA GGACCAGGTT 14700
CGTGCCGGGC GCGGTCGGGG GCATGAGGGC CACGAACGCC AACACGGCCT GGGGGGTCAT 14760
GCTTCCCATG AGGTACCGCG CGGCCGGGTA GCACAGCAGG GAGGCGATAG GGTGCCGGTC 14820
GAAAACAAGG GTGAGGGCCG GGGGCGGGGC TTGCGGGCCC ACAGCCTCCC GCCCGATATG 14880
AGGAGCCAAA ACGGCGTCCG TCGCCGCATA AGGCGTGCTC ATTGTTATCT GGGCGCTGGT 14940
CATTACCACC GCCGCCTCCC CGGCCGATAT CTCGCCGCGG TCCAGACGGT GCTGCGTGTT 15000
GTAGATGTTC GTCAGGGTCT CGGAGGCCCC CAGCACCTGC CAGTAAGTCA TCGGCTCGGG 15060
GACGTAGACG ATATTGTCGC GCGGCCCCAG GGCCTCCATC AGCTGCGCGG AGGTGGTGGT 15120
CTTCCCCACC CCGTGGGGTC CGTCTATATA AACCCGCAGC AGCGTGGGCA GCTCCGGATC 15180
CCCGCGGGCT TCGGAGGCCC CCTGGCGATG GCTAGGACGG GACGCCGCGC GGCCGTCGGT 15240
AGGCCCGCTC GCACGAGCAG CCTGACCGAA CGCAGGCGCG TGCTGTTGGC CGGCGTGAGA 15300
AGCCATACCC GCTTCTACAA GGCGTTCGCC CGAGAGGTGC GGGAGTTCAA CGCCACCAGG 15360
ATTTGTGGAA CGCTGCTGAC GCTGATGAGC GGGTCGCTGC AGGGTCGCTC GCTGTTCGAG 15420
GCCACGCGCG TCACCTTAAT ATGCGAAGTG GACCTCGGGC CGCGCCGCCC AGACTGCATC 15480
TGCGTGTTCG AATTCGCCAA TGACAAAACG TTGGGAGGTG TGTGCGTCAT CCTGGAGCTA 15540
AAGACATGCA AATCGATTTC TTCCGGGGAC ACGGCCAGCA AACGCGAACA GCGGACCACG 15600
GGCATGAAGC AGCTGCGCCA CTCCCTGAAG CTGCTGCAGT CGCTCGCGCC TCCGGGGGAC 15660
AAGGTCGTCT ACCTGTGTCC TATTTTGGTG TTTGTCGCGC AGCGTACGCT GCGCGTCAGC 15720
CGCGTGACCC GGCTCGTCCC GCAAAAGATC TCCGGCAACA TCACCGCGGC CGTGCGGATG 15780
CTCCAAAGCC TGTCCACGTA TGCCGTGCCG CCGGAACCGC AGACCCGGCG GTCGCGGCGC 15840

CGGGTCGCCG CGACCGCCAG ACCGCAAAGG
ACGGCGGGTC ATCCGGCCCC ACCAGAGAGC 
GCTGCGGAGG GTGGGGGTGT GCTTCAGAAA 
GCCAAGAGCA GACCCCGGAC CAAAACCGAG 
TTTTGTTTTC TCTTCTTTCC CCCCCCCCTC
TTAAGCGGAA CCCGCGGGCG CGCGGGGACT 
CAGCCCCTGG GTGTAGACCG CTGTCGCCCC 
TCAAAGAACG TGGTGTTGGG CGCCGGCCAA 
GCCGCCCTCG AACATGGACC CGTACTACCC 

10 GCGCTTCATC GTCGCCGACT CCAGGAGCTT 
GATGTTGCCC GTGTTCAACA TCCCCCGGGA 
GGCCCAGCGC ACCGCGGCCG CGGCGGCCCT 
GCCCGTCGAC ATCGAGCGCC GGATACGCCC 
CGCCCTGGAG GCGCTGGAGA CCGCGGCGGC 

15 CGCCGAGGCG AGGGGGGAGG GCGCTGCGGA 
CGCCGCCGCG GAGATGGAGG TTCAGATCGT 
CAACCTCCCC GTGGATCTGC TACACATGGT 
GGGAGTCGTC TTTGGTACCT GGTACCGCAC 
CCTGACCACC CGCAGCGCCG ACTTTCGAGA 

20 GCTGGTCCTG TCTCTGCAGT CGTGCGGCCG 
CTTCGAGTGC GCCGTGCTGT GTCTGTATCT 
CGATCGCGAT CGCGCTCCCG TTGCGTTCGG 
GGCGCGTCTG GCCGCGGTAA TCGGCGACGA 
CGACAAGCTG CCCAAAGCGC AGTTCGCGGC 

25 GGCCACCCAC GTCGTGATCG CCACGTTGGT 
CGACGTTCCC CGAGACACCA GCACCCGCGT 
CGTCAACCGC GCCGCCGCCG CGTTTTTGGC 
CCAGACGCTG CTGCGGGCGA CCGCCAACAC 
CCTCGCGAAC GGCAACGTGT ACGCGGACCG 

30 GATCCCGGGA GCCGTCCCGG CGGAGGCCAT 
CGCCATAAAA AGCGGCGACA ACAACCTGGA 
GTATCAGGCA GACCCCACGG TCGAGCTGAC 
CCTGGACGCC CAGGCGGGGC GGCCACTGGC 
GGGCGCCCGC CAGGCGGGGC TCGTGCGCCT 

35 CACAAACACC ACCCCTGTGG GGGAGATTAT 
CGAACAGGGC CTGGGGCTGC TCGCCCAGCA 
GCGATTCGCC ACGTTCAACG TGGGCAGCGA 
GTTCATTCCC CAGTACCTGT CCGTGGCCTA 
TTTTTCTGCT GTTGTTGTTT CTGGTCCGCC 

40 CGGGCTTTAG TCCCGGCCCG GACGTCGGCG 
GGTAAGTTGG TTCGGGGGCA TCGCTGTATT 
TTTGTTTGTT TGTGCGGGTG CCCATGGCGT 
CTCTGCCCGA CCGGGCGGTG CCCATCTACG 



CCCCCCTCCC 


CGACACGTGA 


CCCGGAAGGC 


15900 


GACCCCCCCT 


CCCCAGGGGT 


CGTAGGCGTC 


15960 


ATCGCGGCGC 


TTTTTTGCGT 


GCCGGTGGCC 


16020 


TGAGGTTCTG 


TGTGTTGTTT 


TTTTTCCTCG 


16080 


CCCCGCTTCT 


GGCCAAGCAT 


CCTCACCTGC 


16140 


CATTTGTCGC 


CGGCGACACC 


CACCCGACAA 


16200 


CGTCTGTCGC 


CTCTCCCTTT 


TTTCCCCCCC 


16260 


TTCTTCCCGG 


AGCGCCGTCG 


TCGCCCGCCC 


16320 


TTTCGACGCG 


CTGGACGTTT 


GGG AACACAG 


16380 


CATCACCCCC 


GAGTTCCCCC 


GGGACTTCTG 


16440 


GACGGCGGCG 


GAGCGGGCGG 


CAGTGCTGCA 


16500 


GGAGAACGCC 


GCCCTCCAGG 


CCGCCGAGCT 


16560 


GATCGAGCAG 


CAGGTGCATC 


ACATCGCCGA 


16620 


CGCGGCCGAA 


GAGGCGGATG 


CCGCGCGGGA 


16680 


CGGGGCAGCG 


CCGTCGCCCA 


CCGCGGGCCC 


16740 


ACGCAACGAC 


CCGCCGCTAC 


GATACGATAC 


16800 


GTACGCGGGC 


CGCGGGGCCG 


CGGGTTCGTC 


16860 


GATCCAGGAA 


CGCACCATCG 


CGGACTTCCC 


16920 


CGGGCGCATG 


TCCAAGACCT 


TCATGACCGC 


16980 


GCTGTACGTG 


GGCCAGCGCC 


ACTATTCCGC 


17040 


GCTGTACCGA 


ACCACCCACG 


AGTCCTCCCC 


17100 


GGACCTGCTG 


GCCCGCCTGC 


CGCGCTACCT 


17160 


GAGCGGACGC 


CCGCAGTACC 


GCTACCGCGA 


17220 


GGCCGGCGGC 


CGCTACGAGC 


ACGGGGCCCT 


17280 


GCGCCACGGG 


GTGCTACCGG 


CGGCCCCGGG 


17340 


GAACCCCGAC 


GACGTGGCCC 


ACCGCGACGA 


17400 


ACGCGGCCAC 


AACCTCTTCC 


TGTGGGAGGA 


17460 


CATTACGGCC 


CTGGCCGTGC 


TTCGGCGGCT 


17520 


CCTCGACAAC 


CGCCTGCAGC 


TGGGCATGCT 


17580 


CGCTCGGGGG 


GCGTCCGGAT 


TGGACTCGGG 


17640 


GGCGCTGTGC 


GTTAACTATG 


TACTTCCGCT 


17700 


CCAGTTGTTT 


CCGGGGCTGG 


CCGCCCTGTG 


17760 


GTCGACGAGG 


CGCGTGGTGG 


ATATGTCGTC 


17820 


CACCGCGCTG 


GAGCTCATCA 


ACCGCACCCG 


17880 


TAACGCCCAC 


GATGCCTTGG 


GGATACAATA 


17940 


GGCACGCATC 


GGCTTGGCGT 


CGAACGCCAA 


18000 


CTACGACCTG 


TTGTACTTTT 


TGTGTCTCGG 


18060 


GGGAAGGGTG 


GGGGTGGTGG 


TGGTGGGGTG 


18120 


TGGTCACAAA 


AGGCACGGCG 


CCCCGAAACG 


18180 


GACACACAAC 


AACGGCGGGC 


CCCGTGGGTG 


18240 


CCCTTGCCCG 


CTTCCACCCC 


CCCTTCCCGT 


18300 


CGGCGGAAAT 


GCGCGAGCGG 


TTGGAGGCGC 


18360 


TGGCCGGGTT 


TTTGGCCCTG 


TACGACAGCG 
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GGGACCCGGG CGAGCTGGCC CTGGACCCAG ACACGGTGCG TGCGGCCCTG CCTCCGGAGA 18480 

ACCCCCTGCC GATCAACGTA GACCACCGCG CTCGGTGCGA GGTGGGCCGG GTGCTCGCCG 18540 

TGGTCAACGA CCCTCGGGGG CCGTTTTTTG TGGGGCTGAT CGCGTGCGTG CAGCTGGAGC 18600 

GCGTCCTCGA GACGGCCGCC AGCGCCGCTA TTTTTGAGCG CCGCGGACCC GCGCTCTCCC 18660 

GGGAGGAGCG TCTGCTGTAC CTGATCACCA ACTACCTGCC ATCGGTCTCG CTGTCCACAA 18720

AACGCCGGGG GGACGAGGTT CCGCCCGACC GCACCCTGTT TGCGCACGTG GCCCTGTGCG 18780 

CCATCGGGCG GCGCCTTGGA ACCATCGTCA CCTACGACAC CAGCCTAGAC GCGGCCATCG 18840 

CTCCGTTTCG CCACCTGGAC CCGGCGACGC GCGAGGGGGT GCGACGCGAG GCCGCCGAGG 18900 

CCGAGCTCGC GCTGGCCGGG CGCACCTGGG CCCCCGGCGT GGAGGCGCTC ACACACACGC 18960 

TGCTCTCCAC CGCCGTCAAC AACATGATGC TGCGTGACCG CTGGAGCCTC GTGGCCGAGC 19020

GGCGGCGGCA GGCCGGGATC GCCGGACACA CGTACCTTCA GGCGAGCGAA AAATTTAAAA 19080 

TATGGGGGGC GGAGTCTGCC CCTGCGCCGG AGCGCGGGTA TAAAACCGGC GCCCCGGGTG 19140 

CCATGGACAC ATCCCCCGCC GCGAGCGTTC CCGCGCCGCA GGTCGCCGTC CGTGCGCGTC 19200 

AAGTCGCGTC GTCGTCGTCT TCTTCTTCTT CTTTTCCGGC ACCGGCCGAT ATGAACCCCG 19260 

TTTCGGCATC GGGCGCCCCG GCCCCTCCGC CGCCCGGCGA CGGGAGTTAT TTGTGGATCC 19320

CCGCCTTTCA TTACAATCAG CTCGTCACCG GGCAATCCGC GCCCCACCAC CCGCCGCTGA 19380 

CCGCGTGCGG CCTGCCGGCC GCGGGGACGG TGGCCTACGG ACACCCCGGC GCCGGCCCGT 19440 

CCCCGCACTA CCCGCCTCCT CCCGCCCACC CGTACCCGGG TATGCTGTTC GCGGGCCCCA 19500 

GTCCCCTGGA GGCCCAGATC GCCGCGCTGG TGGGGGCCAT CGCCGCCGAC CGCCAGGCGG 19560 

GTGGGCTTCC GGCGGCCGCC GGAGACCACG GGATCCGGGG GTCGGCGAAG CGCCGCCGAC 19620

ACGAGGTGGA GCAGCCGGAG TACGACTGCG GCCGTGACGA GCCGGACCGG GACTTCCCGT 19680 

ATTACCCGGG CGAGGCCCGC CCCGAGCCGC GCCCGGTCGA CTCCCGGCGC GCCGCGCGCC 19740 

AGGCTTCCGG GCCCCACGAA ACCATCACGG CGCTGGTGGG GGCGGTGACG TCCCTGCAGC 19800 

AGGAACTGGC GCACATGCGC GCGCGTACCC ACGCCCCCTA CGGGCCGTAT CCGCCGGTGG 19860 

GGCCCTACCA CCACCCCCAC GCAGACACGG AGACCCCCGC CCAACCACCC CGCTACCCCG 19920

CCGAGGCCGT CTATCTGCCG CCGCCGCACA TCGCCCCCCC GGGGCCTCCT CTATCCGGGG 19980 

CGGTCCCCCC ACCCTCGTAT CCCCCAGTTG CGGTTACCCC CGGTCCCGCC CCCCCGCTAC 20040 

ATCAGCCCTC CCCCGCACAC GCCCACCCCC CTCCGCCGCC GCCGGGACCC ACGCCTCCCC 20100 

CCGCCGCGAG CTTACCCCAA CCCGAGGCGC CCGGCGCGGA GGCCGGCGCC TTAGTTAACG 20160 

CCAGCAGCGC GGCCCACGTG AACGTGGACA CGGCCCGGGC CGCCGATCTG TTTGTGTCAC 20220

AGATGATGGG GTCCCGCTAA CTCGCCTCCA GGATCCGGAC TTGGGGGGGG TGTGTGTTTT 20280 

CATATATTTT AAATAAACAA ACAACCGGAC AAAAGTATAC CCACTTCGTG TGCTTGTGTT 20340 

TTTGTTTGAG AGGGGGGGGG TGGAGTGGGG GGGAAAGTGG GCCGAAT 20388 

(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 262 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Met Asn Ala His Phe Ala Asn Glu Val Gln Tyr Asp Leu Thr Arg Asp
1 5 10 15

Pro Ser Ser Pro Ala Ser Leu Ile His Val Ile Ile Ser Ser Glu Cys
20 25 30

Leu Ala Ala Ala Gly Val Pro Leu Ser Ala Leu Val Arg Gly Arg Pro
35 40 45

Asp Gly Gly Ala Ala Ala Asn Phe Arg Val Glu Thr Gln Thr Arg Ala
50 55 60

His Ala Thr Gly Asp Cys Thr Pro Trp Arg Ser Ala Phe Ala Ala Tyr
65 70 75 80

Val Pro Ala Asp Ala Val Gly Ala Ile Leu Ala Pro Val Ile Pro Ala
85 90 95

His Pro Asp Leu Leu Pro Arg Val Pro Ser Ala Gly Gly Leu Phe Val
100 105 110

Ser Leu Pro Val Ala Cys Asp Ala Gln Gly Val Tyr Asp Pro Tyr Thr
115 120 125

Val Ala Ala Leu Arg Leu Ala Trp Gly Pro Trp Ala Thr Cys Ala Arg
130 135 140

Val Leu Leu Phe Ser Tyr Asp Glu Leu Val Pro Pro Asn Thr Arg Tyr
145 150 155 160

Ala Ala Asp Gly Ala Arg Leu Met Arg Leu Cys Arg His Phe Cys Arg
165 170 175

Tyr Val Ala Arg Leu Gly Ala Ala Ala Pro Ala Ala Ala Thr Glu Ala
180 185 190

Ala Ala His Leu Ser Leu Gly Met Gly Glu Ser Gly Thr Pro Thr Pro
195 200 205

Gln Ala Ser Ser Val Ser Gly Gly Ala Gly Pro Ala Val Val Gly Thr
210 215 220

Pro Asp Pro Pro Ile Ser Pro Glu Glu Gln Leu Thr Ala Pro Gly Gly
225 230 235 240

Asp Thr Ala Thr Ala Glu Asp Val Ser Ile Thr Gln Glu Asn Glu Glu
245 250 255

Ile Xaa Xaa Xaa Xaa Xaa
260

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 423 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

Val Pro Glu Gly Ala Trp Val Gly Gly Ala Cys Ala Arg Pro Arg Gly
1 5 10 15

Pro Arg Ala His Val Arg Leu Tyr Ala Val Cys Phe Val Cys Pro Gln
20 25 30

Gly Ile Arg Gly Gln Asp Phe Asn Leu Leu Phe Val Asp Glu Ala Asn
35 40 45

Phe Ile Arg Pro Asp Ala Val Gln Thr Ile Met Gly Phe Leu Asn Gln
50 55 60

Ala Asn Cys Lys Ile Ile Phe Val Ser Ser Thr Asn Thr Gly Lys Ala
65 70 75 80

Ser Thr Ser Phe Leu Tyr Asn Leu Arg Gly Ala Ala Asp Glu Leu Leu
85 90 95

Asn Val Val Thr Tyr Ile Cys Asp Asp His Met Pro Arg Val Val Thr
100 105 110

His Thr Asn Ala Thr Ala Cys Ser Cys Tyr Ile Leu Asn Lys Pro Val
115 120 125

Phe Ile Thr Met Asp Gly Ala Val Arg Arg Thr Ala Asp Leu Phe Leu
130 135 140

Pro Asp Ser Phe Met Gln Glu Ile Ile Gly Gly Gln Ala Arg Glu Thr
145 150 155 160

Gly Asp Asp Arg Pro Val Leu Thr Lys Ser Ala Gly Glu Arg Phe Leu
165 170 175

Leu Tyr Arg Pro Ser Thr Thr Thr Asn Ser Gly Leu Met Ala Pro Glu
180 185 190

Leu Tyr Val Tyr Val Asp Pro Ala Phe Thr Ala Asn Thr Arg Ala Ser
195 200 205

Gly Thr Gly Ile Ala Val Val Gly Arg Tyr Arg Asp Asp Phe Ile Ile
210 215 220

Phe Ala Leu Glu His Phe Phe Leu Arg Ala Leu Thr Gly Ser Ala Pro
225 230 235 240

Ala Asp Ile Ala Arg Cys Val Val His Ser Leu Ala Gln Val Leu Ala
245 250 255

Leu His Pro Gly Ala Phe Arg Ser Val Arg Val Ala Val Glu Gly Asn
260 265 270

Ser Ser Gln Asp Ser Ala Val Ala Ile Ala Thr His Val His Thr Glu
275 280 285

Met His Arg Ile Leu Ala Ser Ala Gly Ala Asn Gly Pro Gly Pro Glu
290 295 300

Leu Leu Phe Tyr His Cys Glu Pro Pro Gly Gly Ala Val Leu Tyr Pro
305 310 315 320

Phe Phe Leu Leu Asn Lys Gln Lys Thr Pro Ala Phe Glu Tyr Phe Ile
325 330 335

Lys Lys Phe Asn Ser Gly Gly Val Met Ala Ser Gln Glu Leu Val Ser
340 345 350

Val Thr Val Arg Leu Gln Thr Asp Pro Val Glu Tyr Leu Ser Glu Gln
355 360 365

Leu Asn Asn Leu Ile Glu Thr Val Ser Pro Asn Thr Asp Val Arg Met
370 375 380

Tyr Ser Gly Lys Arg Asn Gly Ala Ala Asp Asp Leu Met Val Ala Val
385 390 395 400

Ile Met Ala Ile Tyr Leu Ala Ala Pro Thr Gly Ile Pro Pro Ala Phe
405 410 415

Phe Pro Ile Thr Arg Thr Ser
420

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

Val Leu Leu Ser Pro Ala Pro Pro Pro Leu Pro His Gly Arg Cys Pro
1 5 10 15

Pro Ser Leu Phe His His Arg Pro Gly Cys Val Ser Gly Pro Pro Ala
20 25 30

Pro Pro Arg Ser Gly Val Ser Arg Pro Gly Ala Met Ile Thr Asp Cys
35 40 45

Phe Glu Ala Asp Ile Ala Ile Pro Ser Gly Ile Ser Arg Pro Asp Ala
50 55 60

Ala Ala Leu Gln Arg Cys Glu Gly Arg Val Val Phe Leu Pro Thr Ile
65 70 75 80

Arg Arg Gln Leu Ala Asp Val Ala His Glu Ser Phe Val Ser Gly Gly
85 90 95

Val Ser Pro Asp Thr Leu Gly Leu Leu Leu Ala Tyr Arg Arg Arg Phe
100 105 110

Pro Ala Val Ile Thr Arg Val Leu Pro Thr Arg Ile Val Ala Cys Pro
115 120 125

Val Asp Leu Gly Leu Thr His Ala Gly Thr Val Asn Leu Arg Asn Thr
130 135 140

Ser Pro Val Asp Leu Cys Asn Gly Asp Pro Val Ser Leu Val Pro Pro
145 150 155 160

Val Phe Glu Gly Gln Ala Thr Asp Val Arg Leu Glu Ser Leu Asp Leu
165 170 175

Thr Leu Arg Phe Pro Val Pro Leu Pro Thr Pro Leu Ala Arg Glu Ile
180 185 190

Val Ala Arg Leu Val Arg Ile Arg Asp Leu Asn Pro Asp Pro Arg Thr
195 200 205

Pro Gly Glu Leu Pro Asp Leu Asn Val Leu Tyr Tyr Asn Gly Ala Arg
210 215 220

Leu Ser Leu Val Ala Asp Val Gln Gln Leu Ala Ser Val Asn Thr Glu
225 230 235 240

Leu Arg Ser Leu Val Leu Asn Met Val Tyr Ser Ile Thr Glu Gly Thr
245 250 255

Thr Leu Ile Leu Thr Leu Ile Pro Arg Leu Leu Ala Leu Ser Ala Gln
260 265 270

Asp Gly Tyr Val Asn Ala Leu Leu Gln Met Gln Ser Val Thr Arg Glu
275 280 285

Ala Ala Gln Leu Ile His Pro Glu Ala Pro Met Leu Met Gln Asp Gly
290 295 300

Glu Arg Arg Leu Pro Leu Tyr Glu Ala Leu Val Ala Trp Leu Ala His
305 310 315 320

Ala Gly Gln Leu Gly Asp Ile Leu Ala Pro Ala Val Arg Val Cys Thr
325 330 335

Phe Asp Gly Ala Ala Val Val Gln Ser Gly Asp Met Ala Pro Val Ile
340 345 350

Arg Tyr Pro
355

(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1382 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

Val Trp Glu Gly Leu Gly Leu Pro Glu Leu Gly Leu Met Glu Pro Ala
1 5 10 15

Asn Pro Pro Arg Asn Pro Met Ala Ala Pro Ala Arg Asp Pro Pro Gly
20 25 30

Tyr Arg Tyr Ala Ala Ala Met Val Pro Thr Gly Ser Ile Leu Ser Thr
35 40 45

Ile Glu Val Ala Ser His Arg Arg Leu Phe Asp Phe Phe Ala Arg Val
50 55 60

Arg Ser Asp Glu Asn Ser Leu Tyr Asp Val Glu Phe Asp Ala Leu Leu
65 70 75 80

Gly Ser Tyr Cys Asn Thr Leu Ser Leu Val Arg Phe Leu Glu Leu Gly
85 90 95

Leu Ser Val Ala Cys Val Cys Thr Lys Phe Pro Glu Leu Ala Tyr Met
100 105 110

Asn Glu Gly Arg Val Gln Phe Glu Val His Gln Pro Leu Ile Ala Arg
115 120 125

Asp Gly Pro His Pro Val Glu Gln Pro Val His Asn Tyr Met Thr Lys
130 135 140

Val Ile Asp Arg Arg Ala Leu Asn Ala Ala Phe Ser Leu Ala Thr Glu
145 150 155 160

Ala Ile Ala Leu Leu Thr Gly Glu Ala Leu Asp Gly Thr Gly Ile Ser
165 170 175

Leu His Arg Gln Leu Arg Ala Ile Gln Gln Leu Ala Arg Asn Val Gln
180 185 190

Ala Val Leu Gly Ala Phe Glu Arg Gly Thr Ala Asp Gln Met Leu His
195 200 205

Val Leu Leu Glu Lys Ala Pro Pro Leu Ala Leu Leu Leu Pro Met Gln
210 215 220

Arg Tyr Leu Asp Asn Gly Arg Leu Ala Thr Arg Val Ala Arg Ala Thr
225 230 235 240

Leu Val Ala Glu Leu Lys Arg Ser Phe Cys Asp Thr Ser Phe Phe Leu
245 250 255

Gly Lys Ala Gly His Arg Arg Glu Ala Ile Glu Ala Trp Leu Val Asp
260 265 270

Leu Thr Thr Ala Thr Gln Pro Ser Val Ala Val Pro Arg Leu Thr His
275 280 285

Ala Asp Thr Arg Gly Arg Pro Val Asp Gly Val Leu Val Thr Thr Ala
290 295 300

Ala Ile Lys Gln Arg Leu Leu Gln Ser Phe Leu Lys Val Glu Asp Thr
305 310 315 320

Glu Ala Asp Val Pro Val Thr Tyr Gly Glu Met Val Leu Asn Gly Ala
325 330 335

Asn Leu Val Thr Ala Leu Val Met Gly Lys Ala Val Arg Ser Leu Asp
340 345 350

Asp Val Gly Arg His Leu Leu Glu Met Gln Glu Glu Gln Leu Glu Ala
355 360 365

Asn Arg Glu Thr Leu Asp Glu Leu Glu Ser Ala Pro Gln Thr Thr Arg
370 375 380

Val Arg Ala Asp Leu Val Ala Ile Gly Asp Arg Leu Val Phe Leu Glu
385 390 395 400

Ala Leu Glu Lys Arg Ile Tyr Ala Ala Thr Asn Val Pro Tyr Pro Leu
405 410 415

Val Gly Ala Met Asp Leu Thr Phe Val Leu Pro Leu Gly Leu Phe Asn
420 425 430

Pro Ala Met Glu Arg Phe Ala Ala His Ala Gly Asp Leu Val Pro Ala
435 440 445

Pro Gly His Pro Glu Pro Arg Ala Phe Pro Pro Arg Gln Leu Phe Phe
450 455 460

Trp Gly Lys Asp His Gln Val Leu Arg Leu Ser Met Glu Asn Ala Val
465 470 475 480

Gly Thr Val Cys His Pro Ser Leu Met Asn Ile Asp Ala Ala Val Gly
485 490 495

Gly Val Asn His Asp Pro Val Glu Ala Ala Asn Pro Tyr Gly Ala Tyr
500 505 510

Val Ala Ala Pro Ala Gly Pro Gly Ala Asp Met Gln Gln Arg Phe Leu
515 520 525

Asn Ala Trp Arg Gln Arg Leu Ala His Gly Arg Val Arg Trp Val Ala
530 535 540

Glu Cys Gln Met Thr Ala Glu Gln Phe Met Gln Pro Asp Asn Ala Asn
545 550 555 560

Leu Ala Leu Glu Leu His Pro Ala Phe Asp Phe Phe Ala Gly Val Ala
565 570 575

Asp Val Glu Leu Pro Gly Gly Glu Val Pro Pro Ala Gly Pro Gly Ala
580 585 590

Ile Gln Ala Thr Trp Arg Val Val Asn Gly Asn Leu Pro Leu Ala Leu
595 600 605

Cys Pro Val Ala Phe Arg Asp Arg Leu Glu Leu Gly Val Gly Arg His
610 615 620

Ala Met Ala Pro Ala Thr Ile Ala Ala Val Arg Gly Ala Phe Glu Asp
625 630 635 640

Arg Ser Tyr Pro Ala Val Phe Tyr Leu Leu Gln Ala Ala Ile His Gly
645 650 655

Ser Glu His Val Phe Cys Ala Arg Leu Val Thr Gln Cys Ile Thr Ser
660 665 670

Tyr Trp Asn Asn Thr Arg Cys Ala Ala Phe Val Asn Asp Tyr Ser Leu
675 680 685

Val Ser Tyr Ile Val Thr Tyr Leu Gly Gly Asp Leu Pro Glu Glu Cys
690 695 700

Met Ala Val Tyr Arg Asp Leu Val Ala His Val Glu Ala Gln Leu Val
705 710 715 720

Asp Asp Phe Thr Leu Pro Gly Pro Glu Leu Gly Gly Gln Ala Gln Ala
725 730 735

Glu Leu Asn His Leu Met Arg Asp Pro Ala Leu Leu Pro Pro Leu Val
740 745 750

Trp Asp Cys Asp Gly Leu Met Arg His Ala Ala Leu Asp Arg His Arg
755 760 765

Asp Cys Arg Ile Asp Ala Gly Gly His Glu Pro Val Tyr Ala Ala Ala
770 775 780

Cys Asn Val Ala Thr Ala Asp Phe Asn Arg Asn Asp Gly Arg Leu Leu
785 790 795 800

His Asn Thr Gln Ala Arg Ala Ala Asp Ala Ala Asp Asp Arg Pro His
805 810 815

Arg Pro Ala Asp Trp Thr Val His His Lys Ile Tyr Tyr Tyr Val Leu
820 825 830

Val Pro Ala Phe Ser Arg Gly Arg Cys Cys Thr Ala Gly Val Arg Phe
835 840 845

Asp Arg Val Tyr Ala Thr Leu Gln Asn Met Val Val Pro Glu Ile Ala
850 855 860

Pro Gly Glu Glu Cys Pro Ser Asp Pro Val Thr Asp Pro Ala His Pro
865 870 875 880

Leu His Pro Ala Asn Leu Val Ala Asn Thr Val Asn Ala Met Phe His
885 890 895

Asn Gly Arg Val Val Val Asp Gly Pro Ala Met Leu Thr Leu Gln Val
900 905 910

Leu Ala His Asn Met Ala Glu Arg Thr Thr Ala Leu Leu Cys Ser Ala
915 920 925

Ala Pro Asp Ala Gly Ala Asn Thr Ala Ser Thr Ala Asn Met Arg Ile
930 935 940

Phe Asp Gly Ala Leu His Ala Gly Val Leu Leu Met Ala Pro Gln His
945 950 955 960

Leu Asp His Thr Ile Gln Asn Gly Glu Tyr Phe Tyr Val Leu Pro Val
965 970 975

His Ala Leu Phe Ala Gly Ala Asp His Val Ala Asn Ala Pro Asn Phe
980 985 990

Pro Pro Ala Leu Arg Asp Leu Ala Arg His Val Pro Leu Val Pro Pro
995 1000 1005

Ala Leu Gly Ala Asn Tyr Phe Ser Ser Ile Arg Gln Pro Val Val Gln
1010 1015 1020

His Ala Arg Glu Ser Ala Ala Gly Glu Asn Ala Leu Thr Tyr Ala Leu
1025 1030 1035 1040

Met Ala Gly Tyr Phe Lys Met Ser Pro Val Tyr His Gln Leu Lys Thr
1045 1050 1055

Gly Leu His Pro Gly Phe Gly Phe Thr Val Val Arg Gln Asp Arg Phe
1060 1065 1070

Val Thr Glu Asn Val Leu Phe Ser Ala Ser Glu Ala Tyr Phe Leu Gly
1075 1080 1085

Gln Leu Gln Val Ala Arg His Glu Thr Gly Gly Gly Val Ser Phe Thr
1090 1095 1100

Leu Thr Gln Pro Arg Gly Asn Val Asp Leu Gly Val Gly Tyr Thr Ala
1105 1110 1115 1120

Val Ala Ala Thr Ala Thr Val Arg Asn Pro Val Thr Asp Met Gly Asn
1125 1130 1135

Leu Pro Gln Asn Phe Tyr Leu Gly Arg Gly Ala Pro Pro Leu Leu Asp
1140 1145 1150

Asn Ala Ala Ala Val Tyr Leu Arg Asn Ala Val Val Ala Gly Asn Arg
1155 1160 1165

Leu Gly Pro Ala Gln Pro Leu Pro Val Phe Gly Cys Ala Gln Val Pro
1170 1175 1180

Arg Arg Ala Gly Met Asp His Gly Gln Asp Ala Val Cys Glu Phe Ile
1185 1190 1195 1200

Ala Thr Pro Val Ala Thr Asp Ile Asn Tyr Phe Arg Arg Pro Cys Asn
1205 1210 1215

Pro Arg Gly Arg Ala Ala Gly Gly Val Tyr Ala Gly Asp Lys Glu Gly
1220 1225 1230

Asp Val Ile Ala Leu Met Tyr Asp His Gly Gln Ser Asp Pro Ala Arg
1235 1240 1245

Pro Phe Ala Ala Thr Ala Asn Pro Trp Ala Ser Gln Arg Phe Ser Tyr
1250 1255 1260

Gly Asp Leu Leu Tyr Asn Gly Ala Tyr His Leu Asn Gly Asp Val Leu
1265 1270 1275 1280

Ser Pro Cys Phe Lys Phe Phe Thr Ala Ala Asp Ile Thr Ala Lys His
1285 1290 1295

Arg Cys Leu Glu Arg Leu Ile Val Glu Thr Gly Ser Ala Val Ser Thr
1300 1305 1310

Ala Thr Ala Ala Ser Asp Val Gln Phe Lys Arg Pro Pro Gly Cys Arg
1315 1320 1325

Glu Leu Val Glu Asp Pro Cys Gly Leu Phe Gln Glu Ala Tyr Pro Ile
1330 1335 1340

Thr Cys Ala Ser Asp Pro Ala Leu Leu Arg Ser Ala Arg Asp Gly Glu
1345 1350 1355 1360

Ala His Ala Arg Glu Thr His Phe Thr Gln Tyr Leu Ile Tyr Asp Asp
1365 1370 1375

Leu Lys Gly Leu Ser Leu
1380



(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 222 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Met Thr Met Arg Asp Asp Val Pro Leu Leu Asp Arg Glu Leu Val Tyr
1 5 10 15

Glu Ala Ala Cys Gly Gly Glu Asp Gly Glu Leu Pro Leu Asp Glu Gln
20 25 30

Phe Ser Leu Ser Ser Tyr Gly Thr Ser Asp Phe Phe Val Ser Ser Ala
35 40 45

Tyr Ser Arg Leu Pro Pro His Thr Gln Pro Val Phe Ser Lys Arg Val
50 55 60

Val Met Phe Ala Trp Ser Phe Leu Val Leu Lys Pro Leu Glu Leu Val
65 70 75 80

Ala Ala Gly Met Tyr Tyr Gly Trp Thr Gly Arg Ala Val Ala Pro Ala
85 90 95

Cys Ile Ile Ala Ala Val Leu Ala Tyr Tyr Val Thr Trp Leu Ala Arg
100 105 110

Ala Leu Leu Leu Tyr Val Asn Ile Lys Arg Asp Arg Leu Pro Leu Ser
115 120 125

Pro Pro Val Phe Trp Gly Leu Cys Val Ile Met Gly Gly Ala Ala Leu
130 135 140

Cys Ala Leu Val Ala Ala Ala His Glu Thr Phe Ser Pro Asp Gly Leu
145 150 155 160

Phe His Trp Ile Thr Ala Ser Gln Leu Leu Pro Arg Thr Asp Pro Leu
165 170 175

Arg Ala Arg Ser Leu Gly Ile Ala Cys Ala Ala Gly Ala Ala Met Trp
180 185 190

Val Ala Ala Ala Asp Cys Phe Ala Ala Phe Thr Asn Phe Phe Leu Ala
195 200 205

Arg Phe Trp Thr Arg Ala Ile Leu Lys Ala Pro Val Ala Phe
210 215 220

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 627 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

Val Gly Arg Gln Gly Glu Arg Trp Val Gly Gly Gly Asn Glu Glu Asn
1 5 10 15

Thr Gln Arg Ala Thr Ser Gly Met Arg Pro Glu Leu Ser Leu Lys Gly
20 25 30

Arg Pro Cys Val Thr Glu Ala Val Val Cys Pro Ser Thr Asp Ala Ala
35 40 45

Ile His Ser Gly Gly Ser Ser Ser Val Arg Pro Gln Pro Tyr Ala Arg
50 55 60

Ala Ala Arg Ala Arg Ala Thr His Gly Ser Arg Ser Arg His Arg Gln
65 70 75 80

Pro Leu Leu Pro Pro Pro Ser Ser His His Pro Thr Ile Pro Pro Pro
85 90 95

Pro Ser Pro Pro Arg Gly Ser Pro Ala Met Glu Leu Ser Tyr Ala Thr
100 105 110

Thr Leu His His Arg Asp Val Val Phe Tyr Val Thr Ala Asp Arg Asn
115 120 125

Arg Ala Tyr Phe Val Cys Gly Gly Ser Val Tyr Ser Val Gly Arg Pro
130 135 140

Arg Asp Ser Gln Pro Gly Glu Ile Ala Lys Phe Gly Leu Val Val Arg
145 150 155 160

Gly Thr Gly Pro Lys Asp Arg Met Val Ala Asn Tyr Val Arg Ser Glu
165 170 175

Leu Arg Gln Arg Gly Leu Arg Asp Val Arg Pro Val Gly Glu Asp Glu
180 185 190

Val Phe Leu Asp Ser Val Cys Leu Leu Asn Pro Asn Val Ser Ser Asp
195 200 205

Val Ile Asn Thr Asn Asp Val Glu Val Leu Asp Glu Cys Leu Ala Glu
210 215 220

Tyr Cys Thr Ser Leu Arg Thr Ser Pro Gly Val Leu Val Thr Gly Val
225 230 235 240

Arg Val Arg Ala Arg Asp Arg Val Ile Glu Leu Phe Glu His Pro Ala
245 250 255

Ile Val Asn Ile Ser Ser Arg Phe Ala Tyr Thr Pro Ser Pro Tyr Val
260 265 270

Phe Ala Gln Ala His Leu Pro Arg Leu Pro Ser Ser Leu Glu Pro Leu
275 280 285

Val Ser Gly Leu Phe Asp Gly Ile Pro Ala Pro Arg Gln Pro Leu Asp
290 295 300

Ala Arg Asp Arg Arg Thr Asp Val Val Ile Thr Gly Thr Arg Ala Pro
305 310 315 320

Arg Pro Met Ala Gly Thr Gly Ala Gly Gly Ala Gly Ala Lys Arg Ala
325 330 335

Thr Val Ser Glu Phe Val Gln Val Lys His Ile Asp Arg Val Val Ser
340 345 350

Pro Ser Val Ser Ser Ala Pro Pro Pro Ser Ala Pro Asp Ala Ser Leu
355 360 365

Pro Pro Pro Gly Leu Gln Glu Ala Ala Pro Pro Gly Pro Pro Leu Arg
370 375 380

Glu Leu Trp Trp Val Phe Tyr Ala Gly Asp Arg Ala Leu Glu Glu Pro
385 390 395 400

His Ala Glu Ser Gly Leu Thr Arg Glu Glu Val Arg Ala Val His Gly
405 410 415

Phe Arg Glu Gln Ala Trp Lys Leu Phe Gly Ser Val Gly Ala Pro Arg
420 425 430

Ala Phe Leu Gly Ala Ala Leu Ser Pro Thr Gln Lys Leu Ala Val Tyr
435 440 445

Tyr Tyr Leu Ile His Arg Glu Arg Arg Met Ser Pro Phe Pro Ala Leu
450 455 460

Val Arg Leu Val Gly Arg Tyr Ile Gln Arg His Gly Val Pro Ala Pro
465 470 475 480

Asp Glu Pro Thr Leu Ala Asp Ala Met Asn Gly Leu Phe Arg Asp Ala
485 490 495

Ala Gly Thr Val Ala Glu Gln Leu Leu Met Phe Asp Leu Leu Pro Pro
500 505 510

Lys Asp Val Pro Val Gly Ser Asp Ala Arg Ala Asp Ser Ala Ala Leu
515 520 525

Leu Arg Phe Val Asp Ser Gln Arg Leu Thr Pro Gly Gly Ser Val Ser
530 535 540

Pro Glu His Val Met Tyr Leu Gly Ala Phe Leu Gly Val Leu Tyr Ala
545 550 555 560

Gly His Gly Arg Leu Ala Ala Ala Thr His Thr Ala Arg Leu Thr Gly
565 570 575

Val Thr Ser Leu Val Leu Thr Val Gly Asp Val Asp Arg Met Ser Ala
580 585 590

Phe Asp Arg Gly Pro Ala Gly Ala Ala Gly Arg Thr Arg Thr Ala Gly
595 600 605

Tyr Leu Asp Ala Leu Leu Thr Val Cys Leu Ala Arg Ala Gln His Gly
610 615 620

Gln Ser Val
625

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 908 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Val Ser Ile Ser Ala Gly Val Arg Gly Gln Gly Trp His Arg Ile Ser
1 5 10 15

Thr Pro Pro Lys Asn Gly Ala Gly Arg Ser Val Leu Val Phe Gly Leu
20 25 30

Val Leu Pro Leu Cys Phe Tyr Pro His Pro Thr Pro Ser Phe Gly Pro
35 40 45

Arg Leu Arg Gln Gln Arg Ala Ser Asp Ser Leu Arg Gly Ala Glu Pro
50 55 60

Leu Trp Ala Val Gly Thr Asp Thr Pro Pro Ser Ala Asp Trp Gln Pro
65 70 75 80

Gly Arg Thr Thr Met Gly Pro Gly Leu Trp Val Val Met Gly Val Leu
85 90 95

Val Gly Val Ala Gly Gly His Asp Thr Tyr Trp Thr Glu Gln Ile Asp
100 105 110

Pro Trp Phe Leu His Gly Leu Gly Leu Ala Arg Thr Tyr Trp Arg Asp
115 120 125

Thr Asn Thr Gly Arg Leu Trp Leu Pro Asn Thr Pro Asp Ala Ser Asp
130 135 140

Pro Gln Arg Gly Arg Leu Ala Pro Pro Gly Glu Leu Asn Leu Thr Thr
145 150 155 160

Ala Ser Val Pro Met Leu Arg Trp Tyr Ala Glu Arg Phe Cys Phe Val
165 170 175

Leu Val Thr Thr Ala Glu Phe Pro Arg Asp Pro Gly Gln Leu Leu Tyr
180 185 190

Ile Pro Lys Thr Tyr Leu Leu Gly Arg Pro Arg Asn Ala Ser Leu Pro
195 200 205

Glu Leu Pro Glu Ala Gly Pro Thr Ser Arg Pro Pro Ala Glu Val Thr
210 215 220

Gln Leu Lys Gly Leu Ser His Asn Pro Gly Ala Ser Ala Leu Leu Arg
225 230 235 240

Ser Arg Ala Trp Val Thr Phe Ala Ala Ala Pro Asp Arg Glu Gly Leu
245 250 255

Thr Phe Pro Arg Gly Asp Asp Gly Ala Thr Glu Arg His Pro Asp Gly
260 265 270

Arg Arg Asn Ala Pro Pro Pro Gly Pro Pro Ala Gly Thr Pro Arg His
275 280 285

Pro Thr Thr Asn Leu Ser Ile Ala His Leu His Asn Ala Ser Val Thr
290 295 300

Trp Leu Ala Arg Leu Leu Arg Thr Pro Gly Arg Tyr Val Tyr Leu Ser
305 310 315 320

Pro Ser Ala Ser Thr Trp Pro Val Gly Val Trp Thr Thr Gly Gly Leu
325 330 335

Ala Phe Gly Cys Asp Ala Ala Leu Val Arg Ala Arg Tyr Gly Lys Gly
340 345 350

Phe Met Gly Leu Val Ile Ser Met Arg Asp Ser Pro Pro Ala Glu Ile
355 360 365

Ile Val Val Pro Ala Asp Lys Thr Leu Ala Arg Val Gly Asn Pro Thr
370 375 380

Asp Glu Asn Ala Pro Ala Val Leu Pro Gly Pro Pro Ala Gly Pro Arg
385 390 395 400

Tyr Arg Val Phe Val Leu Gly Ala Pro Thr Pro Ala Asp Asn Gly Ser
405 410 415

Ala Leu Asp Ala Leu Arg Arg Val Ala Gly Tyr Pro Glu Glu Ser Thr
420 425 430

Asn Tyr Ala Gln Tyr Met Ser Arg Ala Tyr Ala Glu Phe Leu Gly Glu
435 440 445

Asp Pro Gly Ser Gly Thr Asp Ala Arg Pro Ser Leu Phe Trp Arg Leu
450 455 460

Ala Gly Leu Leu Ala Ser Ser Gly Phe Ala Phe Val Asn Ala Ala His
465 470 475 480

Ala His Asp Ala Ile Arg Leu Ser Asp Leu Leu Gly Phe Leu Ala His
485 490 495

Ser Arg Val Leu Ala Gly Leu Ala Arg Ala Ala Gly Cys Ala Ala Asp
500 505 510

Ser Val Phe Leu Asn Val Ser Val Leu Asp Pro Ala Ala Arg Leu Arg
515 520 525

Leu Glu Ala Arg Leu Gly His Leu Val Ala Ala Ile Arg Glu Gln Ser
530 535 540

Leu Ala Ala His Ala Leu Gly Tyr Gln Leu Ala Phe Val Leu Asp Ser
545 550 555 560

Pro Ala Ala Tyr Gly Ala Val Ala Pro Ser Ala Ala Arg Leu Ile Asp
565 570 575

Ala Leu Tyr Ala Glu Phe Leu Gly Gly Arg Ala Leu Thr Ala Pro Met
580 585 590

Val Arg Arg Ala Leu Phe Tyr Ala Thr Ala Val Leu Arg Ala Pro Phe
595 600 605

Leu Ala Gly Ala Pro Ser Ala Glu Gln Arg Glu Arg Ala Arg Arg Gly
610 615 620

Leu Leu Ile Thr Thr Ala Leu Cys Thr Ser Asp Val Ala Ala Ala Thr
625 630 635 640

His Ala Asp Leu Arg Ala Ala Arg Thr Asp His Gln Lys Asn Leu Phe
645 650 655

Trp Leu Pro Asp His Phe Ser Pro Cys Ala Ala Ser Leu Arg Phe Asp
660 665 670

Leu Ala Glu Gly Gly Phe Ile Leu Asp Ala Met Ala Thr Arg Ser Asp
675 680 685

Ile Pro Ala Asp Val Met Ala Gln Gln Thr Arg Gly Val Ala Ser Val
690 695 700

Leu Thr Arg Trp Ala His Tyr Asn Ala Leu Ile Arg Ala Phe Val Pro
705 710 715 720

Glu Ala Thr His Gln Cys Ser Gly Pro Ser His Asn Ala Glu Pro Arg
725 730 735

Ile Leu Val Pro Ile Thr His Asn Ala Ser Tyr Val Val Thr His Thr
740 745 750

Pro Leu Pro Arg Gly Ile Gly Tyr Lys Leu Thr Gly Val Asp Val Arg
755 760 765

Arg Pro Leu Phe Ile Thr Tyr Leu Thr Ala Thr Cys Glu Gly His Ala
770 775 780

Arg Glu Ile Glu Pro Lys Arg Leu Val Arg Thr Glu Asn Arg Arg Asp
785 790 795 800

Leu Gly Leu Val Gly Ala Val Phe Leu Arg Tyr Thr Pro Ala Gly Glu
805 810 815

Val Met Ser Val Leu Leu Val Asp Thr Asp Ala Thr Gln Gln Gln Leu
820 825 830

Ala Gln Gly Pro Val Ala Gly Thr Pro Asn Val Phe Ser Ser Asp Val
835 840 845

Pro Ser Val Leu Leu Phe Pro Asn Gly Thr Val Ile His Leu Leu Ala
850 855 860

Phe Asp Thr Leu Pro Ile Ala Thr Ile Ala Pro Gly Phe Leu Ala Ala
865 870 875 880

Ser Ala Leu Gly Val Val Met Ile Thr Ala Ala Gly Ile Leu Arg Val
885 890 895

Val Arg Thr Cys Val Pro Phe Leu Trp Arg Arg Glu
900 905

(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 370 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

Met Ala Ser His Ala Gly Gln Gln His Ala Pro Ala Phe Gly Gln Ala
1 5 10 15

Ala Arg Ala Ser Gly Pro Thr Asp Gly Arg Ala Ala Ser Arg Pro Ser
20 25 30

His Arg Gln Gly Ala Ser Asp Pro Glu Leu Pro Thr Leu Leu Arg Val
35 40 45

Tyr Ile Asp Gly Pro His Gly Val Gly Lys Thr Thr Thr Ser Ala Gln
50 55 60

Leu Met Glu Ala Leu Gly Pro Arg Asp Asn Ile Val Tyr Val Pro Glu
65 70 75 80

Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr Leu Thr Asn
85 90 95

Ile Tyr Asn Thr Gln His Arg Leu Asp Arg Gly Glu Ile Ser Ala Gly
100 105 110

Glu Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met Ser Thr Pro
115 120 125

Tyr Ala Ala Thr Asp Ala Val Leu Ala Pro His Ile Gly Gly Glu Ala
130 135 140

Val Gly Pro Gln Ala Pro Pro Pro Ala Leu Thr Leu Val Phe Asp Arg
145 150 155 160

His Pro Ile Ala Ser Leu Leu Cys Tyr Pro Ala Ala Arg Tyr Leu Met
165 170 175

Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Met Pro Pro Thr
180 185 190

Ala Pro Gly Thr Asn Leu Val Leu Gly Val Leu Pro Glu Ala Glu His
195 200 205

Ala Asp Arg Leu Ala Arg Arg Gln Arg Pro Gly Glu Arg Leu Asp Leu
210 215 220

Ala Met Leu Ser Ala Ile Arg Arg Val Tyr Asp Leu Leu Ala Asn Thr
225 230 235 240

Val Arg Tyr Leu Gln Arg Gly Gly Arg Trp Arg Glu Asp Trp Gly Arg
245 250 255

Leu Thr Gly Val Ala Ala Ala Thr Pro Arg Pro Asp Pro Glu Asp Gly
260 265 270

Ala Gly Ser Leu Pro Arg Ile Glu Asp Thr Leu Phe Ala Leu Phe Arg
275 280 285

Val Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu Tyr His Ile Phe Ala
290 295 300

Trp Val Leu Asp Val Leu Ala Asp Arg Leu Leu Pro Met His Leu Phe
305 310 315 320

Val Leu Asp Tyr Asp Gln Ser Pro Val Gly Cys Arg Asp Ala Leu Leu
325 330 335

Arg Leu Thr Ala Gly Met Ile Pro Thr Arg Val Thr Thr Ala Gly Ser
340 345 350

Ile Ala Glu Ile Arg Asp Leu Ala Arg Thr Phe Ala Arg Glu Val Gly
355 360 365

Gly Val
370

(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

Val Leu Arg Val Val Asp Val Arg Gln Gly Leu Gly Gly Pro Gln His
1 5 10 15

Leu Pro Val Ser His Arg Leu Gly Asp Val Asp Asp Ile Val Ala Arg
20 25 30

Pro Gln Gly Leu His Gln Leu Arg Gly Gly Gly Gly Leu Pro His Pro
35 40 45

Val Gly Ser Val Tyr Ile Asn Pro Gln Gln Arg Gly Gln Leu Arg Ile
50 55 60

Pro Ala Gly Phe Gly Gly Pro Leu Ala Met Ala Arg Thr Gly Arg Arg
65 70 75 80

Ala Ala Val Gly Arg Pro Ala Arg Thr Ser Ser Leu Thr Glu Arg Arg
85 90 95

Arg Val Leu Leu Ala Gly Val Arg Ser His Thr Arg Phe Tyr Lys Ala
100 105 110

Phe Ala Arg Glu Val Arg Glu Phe Asn Ala Thr Arg Ile Cys Gly Thr
115 120 125

Leu Leu Thr Leu Met Ser Gly Ser Leu Gln Gly Arg Ser Leu Phe Glu
130 135 140

Ala Thr Arg Val Thr Leu Ile Cys Glu Val Asp Leu Gly Pro Arg Arg
145 150 155 160

Pro Asp Cys Ile Cys Val Phe Glu Phe Ala Asn Asp Lys Thr Leu Gly
165 170 175

Gly Val Cys Val Ile Leu Lys Thr Cys Lys Ser Ile Ser Ser Gly Asp
180 185 190

Thr Ala Ser Lys Arg Glu Gln Arg Thr Thr Gly Met Lys Gln Leu Arg
195 200 205

His Ser Leu Lys Leu Leu Gln Ser Leu Ala Pro Pro Gly Asp Lys Val
210 215 220

Val Tyr Leu Cys Pro Ile Leu Val Phe Val Ala Gln Arg Thr Leu Arg
225 230 235 240

Val Ser Arg Val Thr Arg Leu Val Pro Gln Lys Ile Ser Gly Asn Ile
245 250 255

Thr Ala Ala Val Arg Met Leu Gln Ser Leu Ser Thr Tyr Ala Val Pro
260 265 270

Pro Glu Pro Gln Thr Arg Arg Ser Arg Arg Arg Val Ala Ala Thr Ala
275 280 285

Arg Pro Gln Arg Pro Pro Ser Pro Thr Arg Asp Pro Glu Gly Thr Ala
290 295 300

Gly His Pro Ala Pro Pro Glu Ser Asp Pro Pro Ser Pro Gly Val Val
305 310 315 320

Gly Val Ala Ala Glu Gly Gly Gly Val Leu Gln Lys Ile Ala Ala Leu
325 330 335

Phe Cys Val Pro Val Ala Ala Lys Ser Arg Pro Arg Thr Lys Thr Glu
340 345 350



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 571 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

Met Asp Pro Tyr Tyr Pro Phe Asp Ala Leu Asp Val Trp Glu His Arg
1 5 10 15

Arg Phe Ile Val Ala Asp Ser Arg Ser Phe Ile Thr Pro Glu Phe Pro
20 25 30

Arg Asp Phe Trp Met Leu Pro Val Phe Asn Ile Pro Arg Glu Thr Ala
35 40 45

Ala Glu Arg Ala Ala Val Leu Gln Ala Gln Arg Thr Ala Ala Ala Ala
50 55 60

Ala Leu Glu Asn Ala Ala Leu Gln Ala Ala Glu Leu Pro Val Asp Ile
65 70 75 80

Glu Arg Arg Ile Arg Pro Ile Glu Gln Gln Val His His Ile Ala Asp
85 90 95

Ala Leu Glu Ala Leu Glu Thr Ala Ala Ala Ala Ala Glu Glu Ala Asp
100 105 110

Ala Ala Arg Asp Ala Glu Arg Glu Gly Ala Ala Asp Gly Ala Ala Pro
115 120 125

Ser Pro Thr Ala Gly Pro Ala Ala Ala Glu Met Glu Val Gln Ile Val
130 135 140

Arg Asn Asp Pro Pro Leu Arg Tyr Asp Thr Asn Leu Pro Val Asp Leu
145 150 155 160

Leu His Met Val Tyr Ala Gly Arg Gly Ala Ala Gly Ser Ser Gly Val
165 170 175

Val Phe Gly Thr Trp Tyr Arg Thr Ile Gln Glu Arg Thr Ile Ala Asp
180 185 190

Phe Pro Leu Thr Thr Arg Ser Ala Asp Phe Arg Asp Gly Arg Met Ser
195 200 205

Lys Thr Phe Met Thr Ala Leu Val Leu Ser Leu Gln Ser Cys Gly Arg
210 215 220

Leu Tyr Val Gly Gln Arg His Tyr Ser Ala Phe Glu Cys Ala Val Leu
225 230 235 240

Cys Leu Tyr Leu Leu Tyr Arg Thr Thr His Glu Ser Ser Pro Asp Arg
245 250 255

Asp Arg Ala Pro Val Ala Phe Gly Asp Leu Leu Ala Arg Leu Pro Arg
260 265 270

Tyr Leu Ala Arg Leu Ala Ala Val Ile Gly Asp Glu Ser Gly Arg Pro
275 280 285

Gln Tyr Arg Tyr Arg Asp Asp Lys Leu Pro Lys Ala Gln Phe Ala Ala
290 295 300

Ala Gly Gly Arg Tyr Glu His Gly Ala Thr His Val Val Ile Ala Thr
305 310 315 320

Leu Val Arg His Gly Val Leu Pro Ala Ala Pro Gly Asp Val Pro Arg
325 330 335

Asp Thr Ser Thr Arg Val Asn Pro Asp Asp Val Ala His Arg Asp Asp
340 345 350

Val Asn Arg Ala Ala Ala Ala Phe Leu Arg His Asn Leu Phe Leu Trp
355 360 365

Glu Asp Gln Thr Leu Leu Arg Ala Thr Ala Asn Thr Ile Thr Ala Val
370 375 380

Leu Arg Arg Leu Leu Ala Asn Gly Asn Val Tyr Ala Asp Arg Leu Asp
385 390 395 400

Asn Arg Leu Gln Leu Gly Met Leu Ile Pro Gly Ala Val Pro Ala Glu
405 410 415

Ala Ile Arg Ala Ser Gly Leu Asp Ser Gly Ala Ile Lys Ser Gly Asp
420 425 430

Asn Asn Leu Glu Ala Leu Cys Val Asn Tyr Val Leu Pro Leu Tyr Gln
435 440 445

Ala Asp Pro Thr Val Glu Leu Thr Gln Leu Phe Pro Gly Leu Ala Ala
450 455 460

Leu Cys Leu Asp Ala Gln Ala Gly Arg Pro Leu Ala Ser Thr Arg Arg
465 470 475 480

Val Val Asp Met Ser Ser Gly Ala Arg Gln Ala Ala Leu Val Arg Leu
485 490 495

Thr Ala Leu Glu Leu Ile Asn Arg Thr Arg Thr Asn Thr Thr Pro Val
500 505 510

Gly Glu Ile Ile Asn Ala His Asp Ala Leu Gly Ile Gln Tyr Glu Gln
515 520 525

Gly Leu Gly Leu Leu Ala Gln Gln Ala Arg Ile Gln Ala Lys Arg Phe
530 535 540

Ala Thr Phe Asn Val Gly Ser Asp Tyr Asp Leu Leu Tyr Phe Leu Cys
545 550 555 560

Leu Gly Phe Ile Pro Gln Tyr Leu Ser Val Ala
565 570



(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 571 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

Met Asp Pro Tyr Tyr Pro Phe Asp Ala Leu Asp Val Trp Glu His Arg
1 5 10 15

Arg Phe Ile Val Ala Asp Ser Arg Ser Phe Ile Thr Pro Glu Phe Pro
20 25 30

Arg Asp Phe Trp Met Leu Pro Val Phe Asn Ile Pro Arg Glu Thr Ala
35 40 45

Ala Glu Arg Ala Ala Val Leu Gln Ala Gln Arg Thr Ala Ala Ala Ala
50 55 60

Ala Leu Glu Asn Ala Ala Leu Gln Ala Ala Glu Leu Pro Val Asp Ile
65 70 75 80

Glu Arg Arg Ile Arg Pro Ile Glu Gln Gln Val His His Ile Ala Asp
85 90 95

Ala Leu Glu Ala Leu Glu Thr Ala Ala Ala Ala Ala Glu Glu Ala Asp
100 105 110

Ala Ala Arg Asp Ala Glu Arg Glu Gly Ala Ala Asp Gly Ala Ala Pro
115 120 125

Ser Pro Thr Ala Gly Pro Ala Ala Ala Glu Met Glu Val Gln Ile Val
130 135 140

Arg Asn Asp Pro Pro Leu Arg Tyr Asp Thr Asn Leu Pro Val Asp Leu
145 150 155 160

Leu His Met Val Tyr Ala Gly Arg Gly Ala Ala Gly Ser Ser Gly Val
165 170 175

Val Phe Gly Thr Trp Tyr Arg Thr Ile Gln Glu Arg Thr Ile Ala Asp
180 185 190

Phe Pro Leu Thr Thr Arg Ser Ala Asp Phe Arg Asp Gly Arg Met Ser
195 200 205

Lys Thr Phe Met Thr Ala Leu Val Leu Ser Leu Gln Ser Cys Gly Arg
210 215 220

Leu Tyr Val Gly Gln Arg His Tyr Ser Ala Phe Glu Cys Ala Val Leu
225 230 235 240

Cys Leu Tyr Leu Leu Tyr Arg Thr Thr His Glu Ser Ser Pro Asp Arg
245 250 255

Asp Arg Ala Pro Val Ala Phe Gly Asp Leu Leu Ala Arg Leu Pro Arg
260 265 270

Tyr Leu Ala Arg Leu Ala Ala Val Ile Gly Asp Glu Ser Gly Arg Pro
275 280 285

Gln Tyr Arg Tyr Arg Asp Asp Lys Leu Pro Lys Ala Gln Phe Ala Ala
290 295 300

Ala Gly Gly Arg Tyr Glu His Gly Ala Thr His Val Val Ile Ala Thr
305 310 315 320

Leu Val Arg His Gly Val Leu Pro Ala Ala Pro Gly Asp Val Pro Arg
325 330 335

Asp Thr Ser Thr Arg Val Asn Pro Asp Asp Val Ala His Arg Asp Asp
340 345 350

Val Asn Arg Ala Ala Ala Ala Phe Leu Arg His Asn Leu Phe Leu Trp
355 360 365

Glu Asp Gln Thr Leu Leu Arg Ala Thr Ala Asn Thr Ile Thr Ala Val
370 375 380

Leu Arg Arg Leu Leu Ala Asn Gly Asn Val Tyr Ala Asp Arg Leu Asp
385 390 395 400

Asn Arg Leu Gln Leu Gly Met Leu Ile Pro Gly Ala Val Pro Ala Glu
405 410 415

Ala Ile Arg Ala Ser Gly Leu Asp Ser Gly Ala Ile Lys Ser Gly Asp
420 425 430

Asn Asn Leu Glu Ala Leu Cys Val Asn Tyr Val Leu Pro Leu Tyr Gln
435 440 445

Ala Asp Pro Thr Val Glu Leu Thr Gln Leu Phe Pro Gly Leu Ala Ala
450 455 460

Leu Cys Leu Asp Ala Gln Ala Gly Arg Pro Leu Ala Ser Thr Arg Arg
465 470 475 480

Val Val Asp Met Ser Ser Gly Ala Arg Gln Ala Ala Leu Val Arg Leu
485 490 495

Thr Ala Leu Glu Leu Ile Asn Arg Thr Arg Thr Asn Thr Thr Pro Val
500 505 510

Gly Glu Ile Ile Asn Ala His Asp Ala Leu Gly Ile Gln Tyr Glu Gln
515 520 525

Gly Leu Gly Leu Leu Ala Gln Gln Ala Arg Ile Gln Ala Lys Arg Phe
530 535 540

Ala Thr Phe Asn Val Gly Ser Asp Tyr Asp Leu Leu Tyr Phe Leu Cys
545 550 555 560

Leu Gly Phe Ile Pro Gln Tyr Leu Ser Val Ala
565 570



(2) INFORMATION FOR SEQ ID NO: 155: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11706 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

GGACGAACCA ACGGACACCT CCGCAAAGCG CGCGCGCGCC TCCCCCGCGG CGTCGCGACA 60

GACCAGATAC AGCAGGGCGT GGAGGCAGTC GCGCGTGCGC GGGGGCAGCC ATACCGCGTA 120

TAGGGTAATG GCGCTGACGC TCTCCTCCAC CCAAACGATG CCGGGGGCTT CCATGCCACG 180

ACGCCCGGGG GTTGCCGTGT ATCGAACGAG CGCGGCCCCA GACTTATAGG GTGCTAAAGT 240

TCACCGCCCC CTGCATCATG GGCCAGGCCT CGGTGGGAAG CTCCGACAGA GCCGCCTCGA 300

GAATGATGTC AGTGTTGGGC TGGGCGCCGG AGGCGTGCGT GCGCAAGCAG CGCCCCCACG 360

CGGGCGCGCG CAGCTTGAAG CGCGCGCCCG CAAACTCCCG CTTATGGGCC ATCAGCAGCG 420

CGTACAGCTG TCTGTGCGTC CGGCAGGCGC TGTGGTCGAT GCGGTGGGCG TCCAGCAGCT 480

CCACGATGGC TCGCTTGGTG AGGTTTTTAA CGCGCCCCGC CCCGGGAAAC GTCTGCGTGC 540

TCTTGGCCAG CTGCACCCCG AACAGTTCGC CCCAGATGAT CTTGAACAGC GACAGCGCGT 600

GCTCCGTCTC GCTCACGGAC CCGCGCGGGG GGCAGCCGCT CAGGGCGTCG GCCACGCGCT 660

TAACCGCGTC CTCCGACAGC AAGGGGCCGT CGGTCACGTT ACAGTGGCCC AGTTCGAACA 720

CCAGCTGCAT GTAGCGGTCG TAGTGGGGGT TCAGCAGCTC CAGCACGTCC TCGGGGCTAA 780

AGGTTCGCCC CGACCCCCCG GCCATCGAGT CCCACTGCAG GCACGCGGCC ATGGTGCTGC 840

ACAGACGGAA CAGCTCCCAG ACGGGGGCGA CGTTTAGGGT GGGGTGTAGG GCCACAAGCT 900

CCAGCTCTCC GGCGGCGTTG ATCGTGGGGA TGACGCCCGT GGCGTAGTGG TCGTAAAGCC 960

GCCGGAAGAT GGCGCTGCTA TGGGCGGCCA TGGGGACGCG AAGACAGGCC TCCAGCAGCA 1020

CCAGGTAGAT GAACCGCGTG CGGCCGACCA GGCTGTTGAG GCCGCGCATG AGCGCGACCA 1080

CCTCGGCCGG CGCGACGTCC GGCCGGAGGT ACTTTTCGAC GAAAAGGCCC ACCTCCTCCG 1140

TCTCGGCGGC CTGGGCCGAC AGGGACGTGT CGGGGTCCTG GCAGCGCAGC TCCCGCAGAT 1200

CCCGCTGGGC CCTCAGGGCA TCAAAATGTA TCCCCCGCAA AAACAGACAA AAGTTCCTCG 1260

GGGTCAGCGC GGCGTCGTGG CCCCAGAACC GCACGTGCAT GCAGTTGAGG GTCAGAAGCA 1320

TGTGGAGGAT GTTAAGACTG TCCGCGAGGC ACGCCAGCGT GCACCTCTCG AAGTAGTGCT 1380

TGTACCGGAA TTTGCTGTAG ATGCGCGACC CCCGCGCCTG CGCCGCGTCG GCGTGCGACG 1440

CGTCGCAGCG CCCTTTGAAC CGGCGGCACA ACAGGTTCGT CACCTGGGAA AACTGTGCCG 1500

GCCACTGCCC GCTGGCGCTC ACCACGTGGT TGAGCAGCAT GGGCGTAAAG ACGGGCTCCG 1560

AGCGCGCCCC GGACCCGTCC ATGTAGATCA GCAGCTCCCC CTTGCGGAGA GTCCGTACCC 1620

GCCCCAGCGA CTGGTACACG GACACCATGT CCGGCCCGTA GTTCATGGGT TTCACGTAGG 1680

CGAACATGCT GTCAAAGTGC GGCGGATCGA AGCTAAGGCC CACCGTCACG ACCGTTGTGT 1740

AGATGACCAC CCGGTACCGG CCCCATGTGG TCACGTCGCC GGGCGGGGTG AGCGAGTGGA 1800

GCAGCAGCAC GCGGTCCGTA AACTGCCGGC AGAACCTGGC AACGACCTCC GCGAAGGAGA 1860

CCGTCGACGA GAAGATGCAG ACGTTATCTC CGCCGGCCAG GCGCGCCTCC AGCTCCCCGA 1920

AGAAGGTGGC GTCCGGGGGG GCGTCCGGGG GGGGCGCCCC GCCCGCCGGC CCCGGCGGGC 1980

GCAGGGCCGC CTGCAGGACC TCGGGCCCCA GGCGCGGGAG AAACAGACAA CGGCGCGCCG 2040

AAAATCCGGG CATGGCATAC TCCCCGATGA CCACGTGAAC GTTCTTTTCG CCCCGGAGGC 2100

TGCACAGAAA GTCCACCAGC TGCGCGTTGG CGGTGGCGTC CATGGCGATG ATCCGCGGGC 2160

ACGTGCGCAG CAGGCGCAGC ATCAACGCGT CGACGCGGCC CAGCTGCTGC ATCGTCGGCG 2220

AGTACAGTTG GCCCAACGTC GACATGACTT CGTCCAGGAC GAGCACGTCG TAGTTGTTCA 2280

ACAGGTTCGG GCCCACGCGA TGAAGACTTT CCACCTGCAC GATGAGACGG TGGAAGGGGC 2340

GGTCGTTCAT GATGTAATTG GTGGATGAGA AGTAGGTGAC GAAGTCGGGC AACCCTGACT 2400

CAGCGAACCG CGTCGCCAGG GTCTGAGTAA AACTCCGACG ACAGGAGACG ACCAGCACAC 2460

TCGTGTCCGG AGAGTGGATC GCTTCCCCCA ACCAGCGGAT CAGCGCGGTA GTTTTTCCCG 2520

AGCCCATTGG CGCGCGGACC ACAGTTACGC ACCGGGCCGT CGGGGCGCTC GCGTCCGGGA 2580

AGGTGACGGG TCCGTGTTGC TGCCGCTCGA TCGTTGTTTT CGGGTGGACC CGGGGAACCC 2640

ACTCGGCCAA ATCCCCCCCG TAAAGCATCC GCGCCAGCGA TACACTCGAC GTGTACTGCT 2700

CGCACTCGTC ATCCCCGATG GGACGCCGGG CCCCCAGGGG ATCCCCCGAG GCCGCGCCGG 2760

GCGCCGACGT CGCGCCCGGG GCGCGGGCGG CGTGGTGGGT CTGGTGTGTG CAGGTGGCGA 2820

CGTTCATCGT CTCGGCCATC TGCGTCGTGG GGCTCCTGGT GCTGGCCTCT GTGTTCCGGG 2880

ACAGGTTTCC CTGCCTTTAC GCCCCCGCGA CCTCTTATGC GGAGGCGAAC GCCACGGTCG 2940

AGGTGCGCGG GGGTGTAGCC GTCCCCCTCC GGTTGGACAC GCAGAGCCTG CTGGCCACGT 3000

ACGCAATTAC GTCTACGCTG TTGCTGGCGG CGGCCGTGTA CGCCGCGGTG GGCGCGGTGA 3060

CCTCGCGCTA CGAGCGCGCG CTGGATGCGG CCCGTCGCCT GGCGGCGGCC CGTATGGCGA 3120

TGCCACACGC CACGCTAATC GCCGGAAACG TCTGCGCGTG GCTGTTGCAG ATCACAGTCC 3180

TGCTGCTGGC CCACCGCATC AGCCAGCTGG CCCACCTTAT CTACGTCCTG CACTTTGCGT 3240

GCCTCGTGTA TCTCGCGGCC CATTTTTGCA CCAGGGGGGT CCTGAGCGGG ACGTACCTGC 3300

GTCAGGTTCA CGGCCTGATT GACCCGGCGC CGACGCACCA TCGTATCGTC GGTCCGGTGC 3360

GGGCAGTAAT GACAAACGCC TTATTACTGG GCACCCTCCT GTGCACGGCC GCCGCCGCGG 3420

TCTCGTTGAA CACGATCGCC GCCCTCAACT TCAACTTTTC CGCCCCGAGC ATGCTCATCT 3480

GCCTGACGAC GCTGTTCGCC CTGCTTGTCG TGTCGCTGTT GTTGGTGGTC GAGGGGGTGC 3540

TGTGTCACTA CGTGCGCGTG TTGGTGGGCC CCCACCTCGG GGCCATCGCC GCCACCGGCA 3600

TCGTCGGCCT GGCCTGCGAG CACTACCACA CCGGTGGCTA CTACGTGGTG GAGCAGCAGT 3660

GGCCGGGGGC CCAGACGGGA GTCCGCGTCG CCCTGGCGCT CGTCGCCGCC TTTGCCCTCG 3720

CCATGGCCGT GCTTCGGTGC ACGCGCGCCT ACCTGTATCA CCGGCGACAC CACACTAAAT 3780

TTTTCGTGCG CATGCGCGAC ACCCGGCACC GCGCCCATTC GGCGCTTCGA CGCGTACGCA 3840

GCTCCATGCG CGGTTCTAGG CGTGGCGGGC CGCCCGGAGA CCCGGGCTAC GCGGAAACCC 3900

CCTACGCGAG CGTGTCCCAC CACGCCGAGA TCGACCGGTA TGGGGATTCC GACGGGGACC 3960

CGATCTACGA CGAAGTGGCC CCCGACCACG AGGCCGAGCT CTACGCCCGA GTGCAACGCC 4020

CCGGGCCTGT GCCCGACGCC GAGCCCATTT ACGACACCGT GGAGGGGTAT GCGCCAAGGT 4080

CCGCGGGGGA GCCGGTGTAC AGCACCGTTC GGCGATGGTA GCCGTTTCGT TCGTTTTAAT 4140

AAACCGACGT TGTGCGTTTC ACCATACTTC GGCGCGCGCG TGTGTGTGTT TTTTTTGTGG 4200

TGTTTATTTT CCCCCACCCC TTCCTTTTCT TTCGGCCACC ACCCCCCTCC TCCCCCGTAC 4260

TATACAACAA AAAATACCAC ACATACGACC AAATACGGAC AATCATTTCT GTCTTTATTC 4320

GCTGTCAGAG AGTGGGGGCG TGAGCGTGGC AGGAGGGCGG GCCACGTCGG GGTCCCGCCG 4380

TCTGGTGTGA CGCGATGGGG GGTCCGATGC GCGCCGGTAC TGGGGCCCCG GCGCCCGGGT 4440

GACCACGCGC ATGTCGGGGG GCACGTAGAA GTTACCCTCT TCTTCGGACT CGATGTCCAC 4500

GACGTCAAAT TCGTGGGCGG TCAGCGAGAC GACCTCCCCG CCGTCGGTGA TGATGACGTT 4560

GTGTCGGCAG CAGCAGGGCC GCGCCCCGGA GAACGCGAGG CCCATAACTT GGCGAGCGTA 4620

TCGTCGAAGG CCAGGCGGCT GTTTCGCCGG ATGTCCCGGT AGATCCCCGG CTCGACGCGG 4680

ACGGGGGTGA TGATCAGGGC GATCGGAACG GCCTGGTCCG GGAGGATCGA TGCCTTGGCG 4740

GGTCCGGGGG CCCCGCCACG CCCGGCGGGC GCTCCGCGGC CGTCCTCCAG GCGGAACGTC 4800 

ACGCCCTCCT CCGCGCCCGC GCGGTGCCTG CCGAGGAACG TCACCAGGTG CGGTTGCAGG 4860 

GGGCAGTCGG GAAAGTGGCT GTCGAGGACG TATCCCTGCA CCAAGATCTG TTTGAAGTTC 4920 

GGGTGGCGGG GGTTGGCGAA GATGGGCTCG CGGCGAACCA GCTCCCCGGA GCTCCAGGCC 4980

ACGGGAGAGA TGGTGCGACG CTCGAGGTCG GGGACGCCAA ACAGAAGCAC CTCCGAGACA 5040 

ACGCCGCTAT TTAACTCCAC CAGCGCCCGA TCCGGGGCGG AGCATCGCCT TTTTTCGCCG 5100 

GCGGCGCGGG AATCGAGCCA GTCCCGGTCT TGGGTGACGA GCGCCTCCTC CGGGCCCGGG 5160 

ACGCGCCCGG GCGCGAAGTA GCGCACGCCG GGGTTGGGGA TGGACCGGAT GAACGCCCGG 5220 

AACGCCTCCG GCGATCGCCG CGCCATCAGG TCCTCGTACG CGGAGGCCGC GGGGGCGCCG 5280

GGGTCCGCGG GGTCGAACGC GTACTTGGCT CGGCACTTAA CCTCGTAGAA GGCCAGGGGG 5340 

GTCTGGGGGG CGGGGGCCAG GTAGCCGTGA GGGTCCCTGG GGCACACGAG GATGTCCAGG 5400 

GACGCCCCCA CCATGCCCGT GTGGCCGTCC ATGAGGACCC CGCACGCGTG CACGTTCTCC 5460 

TCGGCGAGGT CCCCGGGTTG GTGAAAGACG AAGCGCCCGG CGTCGGCGTC GTCGTTGACG 5520 

CCCGCGTCCG CGCGGCCCAC GCAGTAGCGA AACAGCAGGT TTCGGGCCGT CGGCTCGTTC 5580

ACCCGCCCGA ACATCACCGC CGACGACTGG GCGTCCAGCC GCAGGCTGGC GTTGTGGGTG 5640 

AGCCACTGGG ACGAGAAGCA CGGACCCTGC GCGCCCCACC GCAGCGTGGA GGCGGTCGTC 5700 

AGGCCCCGCC GAAGCAGGGC CCAGAGCTGG CAGTCGGCCT GGTTTTGCGT CGCCGCCTCG 5760 

TAAAATCCCA TAAGCGGGCG GGGGGCGACG GCTTCGGCGG CGGACGGGGG GGCGCGGCGC 5820 

GTCAGGCGCC AGAGGTGCCG GCCGAGCCCG CGGTCCACCA TGCCGGCCGC CTCCAGCGAC 5880

ACGACGAGGG AGCACAGATA GTCCAGGCGA GCCCACAGGG GCCCGATGGC CAGAGGGGAG 5940 

CGGAGGCCGC GCAGCAGGCC GCGCAGGTGG CGCTCGAACG TTTCCGCCAA GATATGGGGG 6000 

GGCAGTGCGT TGGGGATCGC CGACGCCGAC CACATCGGGT CGGGGTCCGG GGGACCGGGG 6060 

CTGCAGTCCG GGTCGATGGC GTGTGCGCCC CCCGGCGAGA GGGGAATGTC GGGGGTTGGC 6120 

GGGCCGGATG AGGCCTCAGA GAGGGCCGGG GACGCGGGCC GGGCCTTTTC GCCCGGGGCC 6180

CCGCCGTCGG GTTGCCCACG TGGGGGGCTC TGGGGCCAAT GGGAACCCGG GGCCCCCGGT 6240 

GACGTGGGGC GGGGTGGGGC GGGGCGGGGC CCAAAGACGG TCGCCAGATC TAGGCTGTTG 6300

GGTCGGGGCC GCTTCGGGGG ACTATCGGGG TCGCGGGCGG GGTCCGCGGG GCGCTTGGCG 6360 

CCGGGTGTTG CGGCGGCCGC CATTTTTACG AGCAGCCGAA GAGCTCGAGG GCGGAAGGGA 6420 

TCCTCACGAC AGAGAGTGGC GCGCGGCCGG GTTGGCGTGA CAGAGGCGGG AGACCAGCAC 6480

CAGCAGCGGC CTCAGCTCGG GCGGCAGCGA CACCGACGAC AGGACGGCCT TGTGCGTGCG 6540 

CTGGTAATTT ATACACTGCT CCGTGAACGC GCGCCGAATC TTGGGATTGC GAAGGTGGCG 6600 

CCGGATGCCC TCCGGCACGT CATACGCCAG GCCGTGGGTG TTGGTCTCGG CCGAGTTGAC 6660 

AAAGAGGGCG GGGTGCAGAA CGCAGCGATA GGCGAGGAGG GCCACGGCAA AGTCCGGCGA 6720 

GAGCTGGTTG TTAAAGTACT GGTAGCCCGG GACGCGGGTC ACGGGGACGC CCAGGCTCGG 6780

GGCCACGTAC ACGCTAACCA GCAGCTCCAG CAGCGTCTGC CCCAGGGCGT AGAGATCGAC 6840 

CGCCAGCCCG ACGTCGTGCT TCAGGGGGCG GTTGTTAAAC TCGGCCCGCT CGTTGTTGAG 6900 

GTACTTTACC AAGAGCTCCG GCGGCTGGTT GTACCCGTGC CCCACCAGAG TGTGAAAGTT 6960 

GGCCGTGGTC AGGGCGGCGG GCATCCCAAA CCCCCGGGGG GACTCGAGGT CCGGCTCCTG 7020 

GAGGCAAAAC TGGCCCCGGG ATATCGTGGA GTTGGAGTTC AGGGTCACCA GGCTAAAGTC 7080

GGCCAGGACG GCCCGCCGGA GCGACACCGC GTCCGATCGC AGCATCACGA GGACGTTGGC 7140 

GCACTTGATG TCCAGGTGGC TGATCCCGCA CCTGGTGTTC AGGAACACCA CGGCGCGCGC 7200

CAGGTCTGTG AAGCAGTGGT GGAGGGCCGT CGCGACGGAG GGGGTGGTCG CGCGCAGGGA 7260

CGCCAGCTGG CCGATGTACT TGCCGAGGTC CATGTCGTAC GCGGGGAACA CGATCTGGCG 7320

CTGCTGCAGC GAGAACCCGA GCGGGGTGAT AAAGCCGCGG ATGTCGTGGG TGCGGCCGCC 7380 

GCGAAGAGCG CACTCCCCCA CGAGCAGGGT CGCGACGAGC TCCACGGCAA ACCACTCCTT 7440 

TTCCCGGATG GTCTTCACGG CGAGCTTGTG TTCGCGAATC AACTGCACCT CGCCGTACCC 7500 

CCCCGAGCCC CCGAAGCTGC GGGCCCCGGG GATCTCCAGG GTCGTGTAGC GGAGGGCGGG 7560

GTTGACGGCG AATACGGGGA TGCATAGCTT GTGGATGCGC GCGAGGGACA GGATGTGCGA 7620 

GGGGGGCGAC GGGGGCGAGG TCATGGCCGT CTCGGACCTG CGCAGGGGCG GGCGCCTCAG 7680 

CTTGGCCGCA GGGCCGGGGG CCTCGGGGGA CGAGCGGCGA CGAGACGAGC GGCTCACTCG 7740 

CCATCGGGAC AGTCCCGCGC GAAGCCGCTC CCGGAAGCTG GATCGGCGGC GGGACCCGGG 7800 

GCGGGCTCCG GAGACGGCGC CGTCTCGGGG GGAGGGGCCG CTTGGGCGTC CGGACGCCCG 7860

GCGGCTGAGG GAGTGTATGT AGGACGCGAG CCAGGCCTTG AAGGAGCGTC GGTGTGCACC 7920 

TTGGGGGCTG ATGTCAGCTG CCACATGACT AGCAGGTCGC TGTCGCCCGG ACTCATCCAT 7980 

CCGTCCGCCA GGTCGCCGTC CCCCCACAGA GACGCGTTCG CCGCGGCCTC TTCGAGCTGG 8040 

TCCTCCTGGT CCGCAAGACG ATCGTCCGCC GCGTCCAGGC GCTCGCTAAG CGCGGGATCG 8100 

AGGTACCGTC GGTGTGCGGT TAGAAAATCA CGTCGCGCCG CTTGCTCTTC CACGCGAATT 8160

TTAACACAGG TCGCTCGCTG TCGCATCATC TCTAAGCGCG CGCGGGACTT TAGCCGCGCC 8220 

TCCAATTCCA AGTGGGCCGC CTTGGCGGCC ATAAAGGCGC CAACAAACCT AGGATCTTGT 8280 

GTACTCACGC CCTCCCGGTG TAGCTGCAGG GTCTGGTCCC TGTACACCTC GGCCCGGAGG 8340 

TGCGTCTCGG CCAAACGTCG GCGCAGGGCC GCGTGGCTGG CGTCTCGGCT CATCTCGCCG 8400 

CCCCCGCGCG CGCCCGACGT CGGACTCCTT CGCCCCGACC CCCCTGACCT CAGCCGCCCC 8460

CGCCTCGCCC GCGATGTTTG GCCAGCAGCT GGCGTCCGAC GTGCAGCAGT ACCTGGAGCG 8520 

CCTGGAGAAA CAGAGGCAAC AGAAGGTGGG CGTCGACGAG GCGTCGGCGG GCCTGACGCT 8580 

CGGCGGCGAT GCGCTGCGCG TCCCTTTTTT GGATTTTGCC ACCGCGACGC CCAAGCGCCA 8640 

CCAGACCGTG GTCCCGGGCG TCGGGACGCT CCACGACTGC TGCGAGCACT CGCCGCTCTT 8700 

CTCGGCCGTC GCGCGGCGGT TGCTGTTTAA TAGCCTGGTG CCGGCGCAAC TCAGGGGGCG 8760

TGACTTTGGG GGCGACCACA CGGCCAAGCT GGAGTTCCTG GCCCCCGAGC TGGTGCGGGC 8820 

GGTGGCGCGC CTGCGGTTTC GGGAGTGCGC GCCGGAGGAC GCCGTGCCCC AACGCAACGC 8880 

CTACTACAGC GTCCTGAACA CGTTTCAGGC CCTGCACCGC TCCGAAGCCT TTCGGCAGTT 8940 

GGTTCACTTC GTGCGGGACT TCGCCCAGTT GTTGAAAACC TCGTTCCGGG CCTCTAGTCT 9000 

CGCGGAGACT ACGGGCCCCC CGAAGAAACG GGCCAAGGTG GACGTGGCCA CCCACGGGCA 9060

GACGTACGGC ACCTTGGAGC TCTTCCAGAA AATGATACTA ATGCACGCGA CCTACTTTCT 9120 

GGCCGCCGTG CTGCTCGGGG ACCACGCGGA GCAGGTCAAC ACGTTCGTGC GGCTCGTGTT 9180 

CGAGATCCCC CTGTTTAGCG ACACGGCCGT GCGGCACTTC CGCCAGCGCG CCACCGTGTT 9240 

TCTAGTCCCC AGGCGCCACG GAAAGACCTG GTTTTTGGTG CCCCTCATCG CGCTGTCGCT 9300 

CGCGTCCTTC CGGGGGATCA AGATAGGCTA CACGGCCCAC ATCCGCAAGG CGACCGAGCC 9360

CGTGTTTGAT GAGATCGACG CCTGCCTGCG GGGCTGGTTT GGCTCGTCCC GGGTGGACCA 9420 

CGTCAAGGGG GAAACCATCT CGTTCTCGTT CCCGGACGGC TCGCGCAGCA CGATCGTGTT 9480 

TGCCTCCAGC CACAACACGA ACGTAAGTAC GCCTTCCTCC CGCGGTGCCT GTTTCCCCGG 9540 

TGCCGCCCTC CCCGAGATCG ACCGACAGAC AAACACAGCC AGACGCGAGT GTGGGACGAC 9600 

ACGCCCGCAG CCCCCCCCGC CATGGCGGGG GGAAGCCTTA CTGTTTATTT GTAATCGGAC 9660

GATGAGGCTC TGGCCACGGC CCGCGCGACC GCGGGGCAGC TCGTTGCAAA CAGGCGGCTG 9720 

GTATACGATG ACAGAACGCA GAGGCGCCAC CCGGCGCTGG TCGGGCGGAT GACGCTTTCC 9780 

GCGCCGTCCC GGCCCACGAC GACCTCGTGC AGGTGGGCCG TGATGCGCGG GCGGCGGGTC 9840

GCCTGCCGCA GGATAACCGC GTCCACGGGG TGCCCGAAGA GGAGCTGACA CAGGCTCGCG 9900

TCCCCCCGGA CGGCCAGGGT GCGCTGGGCC ATATTGGACC ACATGCACGG GGCGACGCAG 9960

GGACAGGCCT CCGCCACGGC GGGGGCGCGC CACAGCGCGT TGGCGGAATC GATGTGGGCC 10020

GTCGGGGCGC AGGCGCCGCC TCCTCCCGGG GGGTCGGTAA TCCTGGATAG CAGCCATCCT 10080

AAATGGCGGG CCCGGCTGCC CGGGGGACAG AGCGACCCCA GGTCATCATC CATGGCCCAG 10140

CAGTATATGC GGCCGCCGGG GAGGTGCCAC CAGGCCCCCG GACCCAGGGC ACAGCACGCC 10200

CCCGGATTCG GGGGCGGTTC CGTGGGTACC AGGTAGGCGC CGTCGAGCTC GTGGGCCACG 10260

GGCTCGTCCG CGAGCTGTTC GGCGGCGGGG TCGGGGGTTT CCTCCGGGGG GGAGGCAGCT 10320

TCCAGGTGGC CGAAGGCTAG GGTGCACAGC AGCGGGGTCC GGGGGTGCGT TACGCTGCGG 10380

AGGTGGACGG TGGCGCAGTA GCGGCGCTCG CGGTTAAAGA AGAAAATGGC AAAGAACGTG 10440

TTCGAAGGCA GGCGCAGCGC CTTGGGCCGC GTCAGGTACA GGAAGATCTC GCAGAAAAGG 10500

GCACGCTCGG GGTCGGGGTC CGGAAGGGCC ACCTGGCACA GCGGCTCGGT GAGGACCGTG 10560

AGGCACCGAA AAATCTTAAG CCGCTCGTCC CCCCGAACGA CGCGCCACAC GAAGACAGAG 10620

TTGGCGATGC GCGCGACGAG GTCGGCTTCG GGCCCCGGGT CGGGGGCGCG CGCGTCGGGG 10680

GGGGCGCCCC GGTGACCCGG CGGGGCCGCG GCTCCCGGGG GGCCTGGCGT CGCCTGGGGA 10740

CGCCAGAGTG CCCGCTGTGC CAGGTTGGTG GTGGGGAAGG GACCGGAGAC GCACCAAAAG 10800

CAGAGGGGCC AGCGCGTGTA TGAGTTGGGG GGGGGGTGGG TGAGCGGTGG AACAAAAGCA 10860

CGCGTCAGCG GACAAGGCCG GGTCCCGTAG CCGCCCCGCG ACAGAACCGG AGTCCGACGG 10920

CACGCGCGAC GGGGTCTGCG AGGCTGAGGT ACGCCGCGGT GTTAATGGTA AACGCAAAGC 10980

CTCCCGGAAA GACCACTAGC CCGCAGAGGC GGCGATTGAA CCCAAGGCAG AGGTACGCGT 11040

AGCTCTCTCC CGGAAGGTAT TGCTCGCAGA CCCTGTGCGG GGCAGTGGAG GGGCTGCCCT 11100

CCATGAAGCG ACATTTACTC TGCTCGCGTC CATTGACGTC ACCGTCAATC ACCACTGCGA 11160

TTGGAGGGTT GGTAAGGCGC AGCGTGTCTC CGCTGGTGCT GTAGTAGTCA AACGCGTAGT 11220

GGGCGTCGGA GTCGGCGAAG CGGGCGGGGA TGTCGTCGCT GAGAGGGACG AGCCGCCGCC 11280

GCCGCCCCCG ACCGCCCTGG CCGCCCAGAT GCGCCAGCAC GGCCAGGGCG TACGCGGTGT 11340

GAAAGAACGC GTCGGGGGCG GTCCCCTCGA GGGCGCGCAT CAGGTTCTCC AGGAGCACGG 11400

GGAAGCGCCG CGTCACCTCC CCTAGCCACT CGCTCTGGTG GGGGCCAAAG TCGTAGCGCA 11460

GGCGCTGGAA GATGCGCGGG CCGCCTTGGA GCGCGGCCCG GATAGAGTGG CCCAGGGCCC 11520

GCAGACACGC GATCTGGATG CGCGCGACGA AGGCCACCTC GGCCGCGATG TCAAAGGGCT 11580

GCAGCACGGG GCGCGGGTGG CGCAGGGGTC CCTCGAGCGC GGGAAAGCGA CGCAGCAGCG 11640

CCGTCTGGGC CGCGGGGGAC AGCTGGTGGG GGCGCACGAC GCGCTCGGCG GCACAGGCCT 11700

CCGTC 11706



(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Met Glu Ala Pro Gly Ile Val Trp Val Glu Glu Ser Val Ser Ala Ile
1 5 10 15

Thr Leu Tyr Ala Val Trp Leu Pro Pro Arg Thr Arg Asp Cys Leu His
20 25 30

Ala Leu Leu Tyr Leu Val Cys Arg Asp Ala Ala Gly Glu Ala Arg Ala
35 40 45

Arg Phe Ala Glu Val Ser Val Gly Ser Ser Xaa Xaa Xaa Xaa Xaa
50 55 60

(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 857 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

Met Ala Glu Thr Met Asn Val Ala Thr Cys Thr His Gln Thr His His
1 5 10 15

Ala Ala Arg Ala Pro Gly Ala Thr Ser Ala Pro Gly Ala Ala Ser Gly
20 25 30

Asp Pro Leu Gly Ala Arg Arg Pro Ile Gly Asp Asp Glu Cys Glu Gln
35 40 45

Tyr Thr Ser Ser Val Ser Leu Ala Arg Met Leu Tyr Gly Gly Asp Leu
50 55 60

Ala Glu Trp Val Pro Arg Val His Pro Lys Thr Thr Ile Glu Arg Gln
65 70 75 80

Gln His Gly Pro Val Thr Phe Pro Asp Ala Ser Ala Pro Thr Ala Arg
85 90 95

Cys Val Thr Val Val Arg Ala Pro Met Gly Ser Gly Lys Thr Thr Ala
100 105 110

Leu Ile Arg Trp Leu Gly Glu Ala Ile His Ser Pro Asp Thr Ser Val
115 120 125

Leu Val Val Ser Cys Arg Arg Ser Phe Thr Gln Thr Leu Ala Thr Arg
130 135 140

Phe Ala Glu Ser Gly Leu Pro Asp Phe Val Thr Tyr Phe Ser Ser Thr
145 150 155 160

Asn Tyr Ile Met Asn Asp Arg Pro Phe His Arg Leu Ile Val Gln Val
165 170 175

Glu Ser Leu His Arg Val Gly Pro Asn Leu Leu Asn Asn Tyr Asp Val
180 185 190

Leu Val Leu Asp Glu Val Met Ser Thr Leu Gly Gln Lys Pro Thr Met
195 200 205

Gln Gln Leu Gly Arg Val Asp Ala Leu Met Leu Arg Leu Leu Arg Thr
210 215 220

Cys Pro Arg Ile Ile Ala Met Asp Ala Thr Ala Asn Ala Gln Leu Val
225 230 235 240

Asp Phe Leu Cys Ser Leu Arg Gly Glu Lys Asn Val His Val Val Ile
245 250 255

Gly Glu Tyr Ala Met Pro Gly Phe Ser Ala Arg Arg Cys Leu Phe Leu
260 265 270

Pro Arg Leu Gly Pro Glu Val Leu Gln Ala Ala Leu Arg Pro Pro Gly
275 280 285

Pro Ala Gly Gly Ala Pro Pro Pro Asp Ala Pro Pro Asp Ala Thr Phe
290 295 300

Phe Gly Glu Leu Glu Ala Arg Leu Ala Gly Gly Asp Asn Val Cys Ile
305 310 315 320

Phe Ser Ser Thr Val Ser Phe Ala Glu Val Val Ala Arg Phe Cys Arg
325 330 335

Gln Phe Thr Asp Arg Val Leu Leu Leu His Ser Leu Thr Pro Pro Gly
340 345 350

Asp Val Thr Thr Trp Gly Arg Tyr Arg Val Val Ile Tyr Thr Thr Val
355 360 365

Val Thr Val Gly Leu Ser Phe Asp Pro Pro His Phe Asp Ser Met Phe
370 375 380

Ala Tyr Val Lys Pro Met Asn Tyr Gly Pro Asp Met Val Ser Val Tyr
385 390 395 400

Gln Ser Leu Gly Arg Val Arg Thr Leu Arg Lys Gly Glu Leu Leu Ile
405 410 415

Tyr Met Asp Gly Ser Gly Ala Arg Ser Glu Pro Val Phe Thr Pro Met
420 425 430

Leu Leu Asn His Val Val Ser Ala Ser Gly Gln Trp Pro Ala Gln Phe
435 440 445

Ser Gln Val Thr Asn Leu Leu Cys Arg Arg Phe Lys Gly Arg Cys Asp
450 455 460

Ala Ser His Ala Asp Ala Ala Gln Arg Ser Arg Ile Tyr Ser Lys Phe
465 470 475 480

Arg Tyr Lys His Tyr Phe Glu Arg Cys Thr Leu Ala Cys Leu Ala Asp
485 490 495

Ser Leu Asn Ile Leu His Met Leu Leu Thr Leu Asn Cys Met His Val
500 505 510

Arg Phe Trp Gly His Asp Ala Ala Leu Thr Pro Arg Asn Phe Cys Leu
515 520 525

Phe Leu Arg Gly Ile His Phe Asp Ala Leu Arg Ala Gln Arg Asp Leu
530 535 540

Arg Glu Leu Arg Cys Gln Asp Pro Asp Thr Ser Leu Ser Ala Gln Ala
545 550 555 560

Ala Glu Thr Glu Glu Val Gly Leu Phe Val Glu Lys Tyr Leu Arg Pro
565 570 575

Asp Val Ala Pro Ala Glu Val Val Met Arg Gln Ser Leu Val Gly Arg
580 585 590

Thr Arg Phe Ile Tyr Leu Val Leu Leu Glu Ala Cys Leu Arg Val Pro
595 600 605

Met Ala Ala His Ser Ser Ala Ile Phe Arg Arg Leu Tyr Asp His Tyr
610 615 620

Ala Thr Gly Val Ile Pro Thr Ile Asn Ala Ala Gly Glu Leu Glu Leu
625 630 635 640

Val His Pro Thr Leu Asn Val Ala Pro Val Trp Glu Leu Phe Arg Leu
645 650 655

Cys Ser Thr Met Ala Ala Cys Leu Gln Trp Asp Ser Met Ala Gly Gly
660 665 670

Ser Gly Arg Thr Phe Ser Pro Glu Asp Val Leu Glu Leu Leu Asn Pro
675 680 685

His Tyr Asp Arg Tyr Met Gln Leu Val Phe Glu Leu Gly His Cys Asn
690 695 700

Val Thr Asp Gly Pro Leu Leu Ser Glu Asp Ala Val Lys Arg Val Ala
705 710 715 720

Asp Ala Leu Ser Gly Cys Pro Pro Arg Gly Ser Val Ser Glu Thr Glu
725 730 735

His Ala Leu Ser Leu Phe Lys Ile Ile Trp Gly Glu Leu Phe Gly Val
740 745 750

Gln Leu Ala Lys Ser Thr Gln Thr Phe Pro Gly Ala Gly Arg Val Lys
755 760 765

Asn Leu Thr Lys Arg Ala Ile Val Glu Leu Leu Asp Ala His Arg Ile
770 775 780

Asp His Ser Ala Cys Arg Thr Gln Leu Tyr Ala Leu Leu Met Ala His
785 790 795 800

Lys Arg Glu Phe Ala Gly Ala Arg Phe Lys Leu Arg Ala Pro Ala Trp
805 810 815

Gly Arg Cys Leu Arg Thr His Ala Ser Gly Ala Gln Pro Asn Thr Asp
820 825 830

Ile Ile Ala Ala Leu Ser Glu Leu Pro Thr Glu Ala Trp Pro Met Met
835 840 845

Gln Gly Ala Val Asn Phe Ser Thr Leu
850 855

(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 470 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Val Tyr Cys Ser His Ser Ser Ser Pro Met Gly Arg Arg Ala Pro Arg
1 5 10 15

Gly Ser Pro Glu Ala Ala Pro Gly Ala Asp Val Ala Pro Gly Ala Arg
20 25 30

Ala Ala Trp Trp Val Trp Cys Val Gln Val Ala Thr Phe Ile Val Ser
35 40 45

Ala Ile Cys Val Val Gly Leu Leu Val Leu Ala Ser Val Phe Arg Asp
50 55 60

Arg Phe Pro Cys Leu Tyr Ala Pro Ala Thr Ser Tyr Ala Glu Ala Asn
65 70 75 80

Ala Thr Val Glu Val Arg Gly Gly Val Ala Val Pro Leu Arg Leu Asp
85 90 95

Thr Gln Ser Leu Leu Ala Thr Tyr Ala Ile Thr Ser Thr Leu Leu Leu
100 105 110

Ala Ala Ala Val Tyr Ala Ala Val Gly Ala Val Thr Ser Arg Tyr Glu
115 120 125

Arg Ala Leu Asp Ala Ala Arg Arg Leu Ala Ala Ala Arg Met Ala Met
130 135 140

Pro His Ala Thr Leu Ile Ala Gly Asn Val Cys Ala Trp Leu Leu Gln
145 150 155 160

Ile Thr Val Leu Leu Leu Ala His Arg Ile Ser Gln Leu Ala His Leu
165 170 175

Ile Tyr Val Leu His Phe Ala Cys Leu Val Tyr Leu Ala Ala His Phe
180 185 190

Cys Thr Arg Gly Val Leu Ser Gly Thr Tyr Leu Arg Gln Val His Gly
195 200 205

Leu Ile Asp Pro Ala Pro Thr His His Arg Ile Val Gly Pro Val Arg
210 215 220

Ala Val Met Thr Asn Ala Leu Leu Leu Gly Thr Leu Leu Cys Thr Ala
225 230 235 240

Ala Ala Ala Val Ser Leu Asn Thr Ile Ala Ala Leu Asn Phe Asn Phe
245 250 255

Ser Ala Pro Ser Met Leu Ile Cys Leu Thr Thr Leu Phe Ala Leu Leu
260 265 270

Val Val Ser Leu Leu Leu Val Val Glu Gly Val Leu Cys His Tyr Val
275 280 285

Arg Val Leu Val Gly Pro His Leu Gly Ala Ile Ala Ala Thr Gly Ile
290 295 300

Val Gly Leu Ala Cys Glu His Tyr His Thr Gly Gly Tyr Tyr Val Val
305 310 315 320

Glu Gln Gln Trp Pro Gly Ala Gln Thr Gly Val Arg Val Val Ala Ala
325 330 335

Phe Ala Met Ala Val Leu Arg Cys Thr Arg Ala Tyr Leu Tyr His Arg
340 345 350

Arg His His Thr Lys Phe Phe Val Arg Met Arg Asp Thr Arg His Arg
355 360 365

Ala His Ser Ala Leu Arg Arg Val Arg Ser Ser Met Arg Gly Ser Arg
370 375 380

Arg Gly Gly Pro Pro Gly Asp Pro Gly Tyr Ala Glu Thr Pro Tyr Ala
385 390 395 400

Ser Val Ser His His Ala Glu Ile Asp Arg Tyr Gly Asp Ser Asp Gly
405 410 415

Asp Pro Ile Tyr Asp Glu Val Ala Pro Asp His Glu Ala Glu Leu Tyr
420 425 430

Ala Arg Val Gln Arg Pro Gly Pro Val Pro Asp Ala Glu Pro Ile Tyr
435 440 445

Asp Thr Val Glu Gly Tyr Ala Pro Arg Ser Ala Gly Glu Pro Val Tyr
450 455 460

Ser Thr Val Arg Arg Trp
465 470



(2) INFORMATION FOR SEQ ID NO: 159: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

Met Gly Leu Ala Phe Ser Gly Ala Arg Pro Cys Cys Cys Arg His Asn
1 5 10 15

Val Ile Ile Thr Asp Gly Gly Glu Val Val Ser Leu Thr Ala His Glu
20 25 30

Phe Asp Val Val Asp Ile Glu Ser Glu Glu Glu Gly Asn Phe Tyr Val
35 40 45

Pro Pro Asp Met Arg Val Val Thr Arg Ala Pro Gly Pro Gln Tyr Arg
50 55 60

Arg Ala Ser Asp Pro Pro Ser Arg His Thr Arg Arg Arg Asp Pro Asp
65 70 75 80

Val Ala Arg Pro Pro Ala Thr Leu Thr Pro Pro Leu Ser Asp Ser Glu
85 90 95



(2) INFORMATION FOR SEQ ID NO: 160: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 618 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 



Met Ala Ala Ala Ala Thr Pro Gly Ala Lys Arg Pro Ala Asp Pro Ala
1 5 10 15

Arg Asp Pro Asp Ser Pro Pro Lys Arg Pro Arg Pro Asn Ser Leu Asp
20 25 30

Leu Ala Thr Val Phe Gly Pro Arg Pro Ala Pro Pro Arg Pro Thr Ser
35 40 45

Pro Gly Ala Pro Gly Ser His Trp Pro Gln Ser Pro Pro Arg Gly Gln
50 55 60

Pro Asp Gly Gly Ala Pro Gly Glu Lys Ala Arg Pro Asp Ala Leu Ser
65 70 75 80

Glu Ala Ser Ser Gly Pro Pro Thr Pro Asp Ile Pro Leu Ser Pro Gly
85 90 95

Gly Ala His Ala Ile Asp Pro Asp Cys Ser Pro Gly Pro Pro Asp Pro
100 105 110

Asp Pro Met Trp Ser Ala Ser Ala Ile Pro Asn Ala Leu Pro Pro His
115 120 125

Ile Leu Ala Glu Thr Phe Glu Arg His Leu Arg Gly Leu Leu Arg Gly
130 135 140

Val Arg Ser Pro Leu Ala Ile Gly Pro Leu Trp Ala Arg Leu Asp Tyr
145 150 155 160

Leu Cys Ser Leu Val Val Ser Leu Glu Ala Ala Gly Met Val Asp Arg
165 170 175

Gly Leu Gly Arg His Leu Trp Arg Leu Thr Arg Arg Ala Pro Pro Ser
180 185 190

Ala Ala Glu Ala Val Ala Pro Arg Pro Leu Met Gly Phe Tyr Glu Ala
195 200 205

Ala Thr Gln Asn Gln Ala Asp Cys Gln Leu Trp Ala Leu Leu Arg Arg
210 215 220

Gly Leu Thr Thr Ala Ser Thr Leu Arg Trp Gly Ala Gln Gly Pro Cys
225 230 235 240

Phe Ser Ser Gln Trp Leu Thr His Asn Ala Ser Leu Arg Leu Asp Ala
245 250 255

Gln Ser Ser Ala Val Met Phe Gly Arg Val Asn Glu Pro Thr Ala Arg
260 265 270

Asn Leu Leu Phe Arg Tyr Cys Val Gly Arg Ala Asp Ala Gly Val Asn
275 280 285

Asp Asp Ala Asp Ala Gly Arg Phe Val Phe His Gln Pro Gly Asp Leu
290 295 300

Ala Glu Glu Asn Val His Ala Cys Gly Val Leu Met Asp Gly His Thr
305 310 315 320

Gly Met Val Gly Ala Ser Leu Asp Ile Leu Val Cys Pro Arg Asp Pro
325 330 335

His Gly Tyr Leu Ala Pro Ala Pro Gln Thr Pro Leu Ala Phe Tyr Glu
340 345 350

Val Lys Cys Arg Ala Lys Tyr Ala Phe Asp Pro Ala Asp Pro Gly Ala
355 360 365

Pro Ala Ala Ser Ala Tyr Glu Asp Leu Met Ala Arg Arg Ser Pro Glu
370 375 380

Ala Phe Arg Ala Phe Ile Arg Ser Ile Pro Asn Pro Gly Val Arg Tyr
385 390 395 400

Phe Ala Pro Gly Arg Val Pro Gly Pro Glu Glu Ala Leu Val Thr Gln
405 410 415

Asp Arg Asp Trp Leu Asp Ser Arg Ala Ala Gly Glu Lys Arg Arg Cys
420 425 430

Ser Ala Pro Asp Arg Ala Leu Val Glu Leu Asn Ser Gly Val Val Ser
435 440 445

Glu Val Leu Leu Phe Gly Val Pro Asp Leu Glu Arg Arg Thr Ile Ser
450 455 460

Pro Val Ala Trp Ser Ser Gly Glu Leu Val Arg Arg Glu Pro Ile Phe
465 470 475 480

Ala Asn Pro Arg His Pro Asn Phe Lys Gln Ile Leu Val Gln Gly Tyr
485 490 495

Val Leu Asp Ser His Phe Pro Asp Cys Pro Leu Gln Pro His Leu Val
500 505 510

Thr Phe Leu Gly Arg His Arg Ala Gly Ala Glu Glu Gly Val Thr Phe
515 520 525

Arg Leu Glu Asp Gly Arg Gly Ala Pro Ala Gly Arg Gly Gly Ala Pro
530 535 540

Gly Pro Ala Lys Ala Ser Ile Leu Pro Asp Gln Ala Val Pro Ile Ala
545 550 555 560

Leu Ile Ile Thr Pro Val Arg Val Glu Pro Gly Ile Tyr Arg Asp Ile
565 570 575

Arg Arg Asn Ser Arg Leu Ala Phe Asp Asp Thr Leu Ala Lys Leu Trp
580 585 590

Ala Ser Arg Ser Pro Gly Arg Gly Pro Ala Ala Ala Asp Thr Thr Ser
595 600 605

Ser Ser Pro Thr Ala Gly Arg Ser Ser Arg
610 615



(2) INFORMATION FOR SEQ ID NO: 161: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 525 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 



Val Gly Gly Arg Arg Pro Gly Gly Arg Met Asp Glu Ser Gly Arg Gln
1 5 10 15

Arg Pro Ala Ser His Val Ala Ala Asp Ile Ser Pro Gln Gly Ala His
20 25 30

Arg Arg Ser Phe Lys Ala Trp Leu Ala Ser Tyr Ile His Ser Leu Ser
35 40 45

Arg Arg Ala Ser Gly Arg Pro Ser Gly Pro Ser Pro Arg Asp Gly Ala
50 55 60

Val Ser Gly Ala Arg Pro Gly Ser Arg Arg Arg Ser Ser Phe Arg Glu
65 70 75 80

Arg Leu Arg Ala Gly Leu Ser Arg Trp Arg Val Ser Arg Ser Ser Arg 

85 90 95 

Arg Arg Ser Ser Pro Glu Ala Pro Gly Pro Ala Ala Lys Leu Arg Arg
100 105 110

Pro Pro Leu Arg Arg Ser Glu Thr Ala Met Thr Ser Pro Pro Ser Pro
115 120 125

Pro Ser His Ile Leu Ser Leu Ala Arg Ile His Lys Leu Cys Ile Pro
130 135 140

Val Phe Ala Val Asn Pro Ala Leu Arg Tyr Thr Thr Leu Glu Ile Pro
145 150 155 160

Gly Ala Arg Ser Phe Gly Gly Ser Gly Gly Tyr Gly Glu Val Gln Leu
165 170 175

Ile Arg Glu His Lys Leu Ala Val Lys Thr Ile Arg Glu Lys Glu Trp
180 185 190

Phe Ala Val Glu Leu Val Ala Thr Leu Leu Val Gly Glu Cys Ala Leu
195 200 205

Arg Gly Gly Arg Thr His Asp Ile Arg Gly Phe Ile Thr Pro Leu Gly
210 215 220

Phe Ser Leu Gln Gln Arg Gln Ile Val Phe Pro Ala Tyr Asp Met Asp
225 230 235 240

Leu Gly Lys Tyr Ile Gly Gln Leu Ala Ser Leu Arg Ala Thr Thr Pro
245 250 255

Ser Val Ala Thr Ala Leu His His Cys Phe Thr Asp Leu Ala Arg Ala
260 265 270

Val Val Phe Leu Asn Thr Arg Cys Gly Ile Ser His Leu Asp Ile Lys
275 280 285

Cys Ala Asn Val Leu Val Met Leu Arg Ser Asp Ala Val Ser Leu Arg
290 295 300

Arg Ala Val Leu Ala Asp Phe Ser Leu Val Thr Leu Asn Ser Asn Ser
305 310 315 320

Thr Ile Ser Arg Gly Gln Phe Cys Leu Gln Glu Pro Asp Leu Glu Ser
325 330 335

Pro Arg Gly Phe Gly Met Pro Ala Ala Leu Thr Thr Ala Asn Phe His
340 345 350

Thr Leu Val Gly His Gly Tyr Asn Gln Pro Pro Glu Leu Leu Val Lys
355 360 365

Tyr Leu Asn Asn Glu Arg Ala Glu Phe Asn Asn Arg Pro Leu Lys His
370 375 380

Asp Val Gly Leu Ala Val Asp Leu Tyr Ala Leu Gly Gln Thr Leu Leu
385 390 395 400

Glu Leu Leu Val Ser Val Tyr Val Ala Pro Ser Leu Gly Val Pro Val
405 410 415

Thr Arg Val Pro Gly Tyr Gln Tyr Phe Asn Asn Gln Leu Ser Pro Asp
420 425 430

Phe Ala Val Leu Ala Tyr Arg Cys Val Leu His Pro Ala Leu Phe Val
435 440 445

Asn Ser Ala Glu Thr Asn Thr His Gly Leu Ala Tyr Asp Val Pro Glu
450 455 460

Gly Ile Arg Arg His Leu Arg Asn Pro Lys Ile Arg Arg Ala Phe Thr
465 470 475 480

Glu Gln Cys Ile Asn Tyr Gln Arg Thr His Lys Ala Val Leu Ser Ser
485 490 495

Val Ser Leu Pro Pro Glu Leu Arg Pro Leu Leu Val Leu Val Ser Arg
500 505 510

Leu Cys His Ala Asn Pro Ala Ala Arg His Ser Leu Ser
515 520 525



(2) INFORMATION FOR SEQ ID NO: 162: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

Met Ser Arg Asp Ala Ser His Ala Ala Leu Arg Arg Arg Leu Ala Glu
1 5 10 15

Thr His Leu Arg Ala Glu Val Tyr Arg Asp Gln Thr Leu Gln Leu His
20 25 30

Arg Glu Gly Val Ser Thr Gln Asp Pro Arg Phe Val Gly Ala Phe Met
35 40 45

Ala Ala Lys Ala Ala His Leu Glu Leu Glu Ala Arg Leu Lys Ser Arg
50 55 60

Ala Arg Leu Glu Met Met Arg Gln Arg Ala Thr Cys Val Lys Ile Arg
65 70 75 80

Val Glu Glu Gln Ala Ala Arg Arg Asp Phe Leu Thr Ala His Arg Arg
85 90 95

Tyr Leu Asp Pro Ala Leu Ser Leu Asp Ala Ala Asp Asp Arg Leu Ala
100 105 110

Asp Gln Glu Glu Gln Leu Glu Glu Ala Ala Ala Asn Ala Ser Leu Trp
115 120 125

Gly Asp Gly Asp Leu Ala Asp Gly Trp Met Ser Pro Gly Asp Ser Asp
130 135 140

Leu Leu Val Met Trp Gln Leu Thr Ser Ala Pro Lys Val His Thr Asp
145 150 155 160

Ala Pro Ser Arg Pro Gly Ser Arg Pro Thr Tyr Thr Pro Ser Ala Ala
165 170 175

Gly Arg Pro Asp Ala Gln Ala Ala Pro Pro Pro Glu Thr Ala Pro Ser
180 185 190

Pro Glu Pro Ala Pro Gly Pro Ala Ala Asp Pro Ala Ser Gly Ser Gly
195 200 205

Phe Ala Arg Asp Cys Pro Asp Gly Glu
210 215

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 239 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

Pro Ala Asp Leu Glu Pro Leu Gly Asp Pro Thr Leu Trp Arg Ala Leu
1 5 10 15

Tyr Ala Cys Val Leu Ala Ala Leu Glu Arg Gln Thr Gly Pro Val Phe
20 25 30

Val Pro Leu Arg Leu Gly Trp Asp Pro Gln Thr Gly Leu Val Val Arg
35 40 45

Val Glu Arg Ala Ser Trp Gly Pro Pro Ala Ala Pro Arg Ala Ala Leu
50 55 60

Leu Asp Val Glu Ala Lys Val Asp Val Asp Pro Leu Ala Ala Arg Val
65 70 75 80

Ala Glu His Pro Gly Ala Arg Leu Ala Trp Ala Arg Leu Ala Ala Ile
85 90 95

Arg Asp Ser Pro Gln Cys Ala Ser Ser Ala Ser Leu Ala Val Thr Ile
100 105 110

Thr Thr Arg Thr Ala Arg Phe Ala Arg Glu Tyr Thr Thr Leu Ala Phe
115 120 125

Pro Pro Thr Ser Lys Glu Gly Ala Phe Ala Asp Leu Val Glu Val Cys
130 135 140

Glu Val Gly Leu Arg Pro Arg Gly His Pro Gln Arg Val Thr Ala Arg
145 150 155 160

Val Leu Leu Pro Arg Gly Tyr Asp Tyr Phe Val Ser Ala Gly Asp Gly
165 170 175

Phe Ser Ala Pro Ala Leu Val Phe Arg Gln Trp His Thr Thr Val His
180 185 190

Ala Ala Pro Gly Ala Pro Val Phe Ala Phe Leu Gly Ala Gly Phe Asp
195 200 205

Val Arg Gly Gly Pro Val Gln Tyr Phe Ala Val Leu Gly Phe Pro Gly
210 215 220

Trp Pro Thr Phe Thr Val Pro Ala Ala Ala Xaa Xaa Xaa Xaa Xaa
225 230 235

(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 315 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 

Val Trp Arg Val Val Arg Gly Asp Glu Arg Leu Lys Ile Phe Arg Cys
1 5 10 15

Leu Thr Val Leu Thr Glu Pro Leu Cys Gln Val Pro Asp Pro Asp Pro
20 25 30

Glu Arg Ala Leu Phe Cys Glu Ile Phe Leu Tyr Leu Trp Lys Ala Leu
35 40 45

Arg Leu Pro Ser Asn Thr Phe Phe Ala Ile Phe Phe Phe Asn Arg Glu
50 55 60

Arg Arg Tyr Cys Ala Thr Val His Leu Arg Ser Val Thr His Pro Arg
65 70 75 80

Thr Pro Leu Leu Cys Thr Leu Ala Phe Gly His Leu Glu Ala Asp Pro
85 90 95

Glu Glu Thr Pro Asp Pro Ala Ala Glu Gln Leu Ala Asp Glu Pro Val
100 105 110

Ala His Glu Leu Asp Gly Ala Tyr Leu Val Pro Thr Glu Pro Pro Pro
115 120 125

Asn Pro Gly Ala Cys Cys Ala Leu Gly Pro Gly Ala Trp Trp His Leu
130 135 140

Pro Gly Gly Arg Ile Tyr Cys Trp Ala Met Asp Asp Asp Leu Gly Ser
145 150 155 160

Leu Cys Pro Pro Gly Ser Arg Ala Arg His Leu Gly Trp Leu Leu Ser
165 170 175

Arg Ile Thr Asp Pro Pro Gly Gly Gly Gly Ala Cys Ala Pro Thr Ala
180 185 190

His Ile Asp Ser Ala Asn Ala Leu Trp Arg Ala Pro Ala Val Ala Glu
195 200 205

Ala Cys Pro Cys Val Ala Pro Cys Met Trp Ser Asn Met Ala Gln Arg
210 215 220

Thr Leu Ala Val Arg Gly Asp Ala Ser Leu Cys Gln Leu Leu Phe Gly
225 230 235 240

His Pro Val Asp Ala Val Ile Leu Arg Gln Ala Thr Arg Arg Pro Arg
245 250 255

Ile Thr Ala His Leu His Glu Val Val Val Gly Arg Asp Gly Ala Glu
260 265 270

Ser Val Ile Arg Pro Thr Ser Ala Gly Trp Arg Leu Cys Val Leu Ser
275 280 285

Ser Tyr Thr Ser Arg Leu Phe Ala Thr Ser Cys Pro Ala Val Ala Arg
290 295 300

Ala Val Ala Arg Ala Ser Ser Ser Asp Tyr Lys
305 310 315

(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 278 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

Leu Thr Glu Ala Cys Ala Ala Glu Arg Val Val Arg Pro His Gln Leu
1 5 10 15

Ser Pro Ala Ala Gln Thr Ala Leu Leu Arg Arg Phe Pro Ala Leu Glu
20 25 30

Gly Pro Leu Arg His Pro Arg Pro Val Leu Gln Pro Phe Asp Ile Ala
35 40 45

Ala Glu Val Ala Phe Val Ala Arg Ile Gln Ile Ala Cys Leu Arg Ala
50 55 60

Leu Gly His Ser Ile Arg Ala Ala Leu Gln Gly Gly Pro Arg Ile Phe
65 70 75 80

Gln Arg Leu Arg Tyr Asp Phe Gly Pro His Gln Ser Glu Trp Leu Gly
85 90 95

Glu Val Thr Arg Arg Phe Pro Val Leu Leu Glu Asn Leu Met Arg Ala
100 105 110

Leu Glu Gly Thr Ala Pro Asp Ala Phe Phe His Thr Ala Tyr Ala Val
115 120 125

Leu Ala His Leu Gly Gly Gln Gly Gly Arg Gly Arg Arg Arg Arg Leu
130 135 140

Val Pro Leu Ser Asp Asp Ile Pro Ala Arg Phe Ala Asp Ser Asp Ala
145 150 155 160

His Tyr Ala Phe Asp Tyr Tyr Ser Thr Ser Gly Asp Thr Leu Arg Leu
165 170 175

Thr Asn Arg Pro Ile Ala Val Val Ile Asp Gly Asp Val Asn Gly Arg
180 185 190

Glu Gln Ser Lys Cys Arg Phe Met Glu Gly Ser Pro Ser Thr Ala Pro
195 200 205

His Arg Val Cys Glu Gln Tyr Leu Pro Gly Glu Ser Tyr Ala Tyr Leu
210 215 220

Cys Leu Gly Phe Asn Arg Arg Leu Cys Gly Leu Val Val Phe Pro Gly
225 230 235 240

Gly Phe Ala Phe Thr Ile Asn Thr Ala Ala Tyr Leu Ser Leu Ala Asp
245 250 255

Pro Val Ala Arg Ala Val Gly Leu Arg Phe Cys Arg Gly Ala Ala Thr
260 265 270

Gly Pro Gly Leu Val Arg
275

(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 731 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

GCGGCGGCCG GCACGGTAAA CGTGGGCCAG CCCGGAAATC CCAGCACGGC AAAGTATTGG 60 

ACGGGCCCTC CCCGGACGTC AAACCCGGCC CCCAGAAAAG CGAAGACGGG GGCCAGGGCT 120

CCGGGGGCGG CGTGGACCGT GGTATGCCAC TGCCGGAAGA GGGCGACCAG CGCCGGGGCG 180 

GAGAACCCGT CGCCGGCGCT CACGAAGTAG TCGTAGCCGC GCGGCAGCAG CACCCGCGCC 240 

GTGACCCGCT GCGGGTGTCC GCGGGGCCGC AGGCCGACCT CGCACACCTC GACCAGGTCC 300

GCGAAGGCGC CCTCCTTGCT GGTCGGCGGA AACGCCAGGG TGGTGTATTC GCGCGCGAAA 360

CGCGCGGTCC TCGTCGTGAT GGTGACGGCG AGCGAGGCGG AGGACGCGCA CTGGGGGCTG 420

TCGCGAATGG CGGCCAGGCG CGCCCACGCC AACCGCGCGC CGGGGTGCTC GGCGACGCGC 480

GCGGCCAGGG CCAGCGGGTC GACGTCGACC TTGGCCTCCA CGTCCAGGAG GGCGGCGCGA 540

GGAGCGGCCG GCGGGCCCCA CGACGCCCTT TCGACCCTCA CGACCAGACC CGTCTGCGGG 600

TCCCAGCCCA GGCGCAGCGG GACGAAGAGG GCCACCGGCC CCGTCTGGCG CTCCAGGGCC 660

GCCAGAACGC ACGCATACAG CGCCCGCCAC AGGGTCGGGT CCCCCAGGGG CTCCAGCGGG 720

GAGGCGGCCG 731



(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 



Pro Ala Asp Leu Glu Pro Leu Gly Asp Pro Thr Leu Trp Arg Ala Leu
1 5 10 15

Tyr Ala Cys Val Leu Ala Ala Leu Glu Arg Gln Thr Gly Pro Val Phe
20 25 30

Val Pro Leu Arg Leu Gly Trp Asp Pro Gln Thr Gly Leu Val Val Arg
35 40 45

Val Glu Arg Ala Ser Trp Gly Pro Pro Ala Ala Pro Arg Ala Ala Leu
50 55 60

Leu Asp Val Glu Ala Lys Val Asp Val Asp Pro Leu Ala Ala Arg Val
65 70 75 80

Ala Glu His Pro Gly Ala Arg Leu Ala Trp Ala Arg Leu Ala Ala Ile
85 90 95

Arg Asp Ser Pro Gln Cys Ala Ser Ser Ala Ser Leu Ala Val Thr Ile
100 105 110

Thr Thr Arg Thr Ala Arg Phe Ala Arg Glu Tyr Thr Thr Leu Ala Phe
115 120 125

Pro Pro Thr Ser Lys Glu Gly Ala Phe Ala Asp Leu Val Glu Val Cys
130 135 140

Glu Val Gly Leu Arg Pro Arg Gly His Pro Gln Arg Val Thr Ala Arg
145 150 155 160

Val Leu Leu Pro Arg Gly Tyr Asp Tyr Phe Val Ser Ala Gly Asp Gly
165 170 175

Phe Ser Ala Pro Ala Leu Val Phe Arg Gln Trp His Thr Thr Val His
180 185 190

Ala Ala Pro Gly Ala Pro Val Phe Ala Phe Leu Gly Ala Gly Phe Asp
195 200 205

Val Arg Gly Gly Pro Val Gln Tyr Phe Ala Val Leu Gly Phe Pro Gly
210 215 220

Trp Pro Thr Phe Thr Val Pro Ala Ala Ala Xaa Xaa Xaa Xaa Xaa
225 230 235

(2) INFORMATION FOR SEQ ID NO: 168:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3005 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

GCGCTGGAGC GGGAGCAGCG CGCGGCGGAC CGGGCGGCCG GGGGAGGCGC GGGCCGCCCG 60

GCGGAGGCGG ATCTTCTCCG GGCCGACTAC GACATTATCG ACGTCAGCAA GTCCATGGAC 120 

GACGACACGT ACGTGGCCAA CAGTTTCCAG CACCAGTACA TCCCCGCGTA CGGCCAGGAC 180 

CTCGAGCGCC TGTCGCGCCT CTGGGAGCAC GAGCTGGTGC GCTGCTTCAA GATTCTGCGC 240 

CACCGCAACA ACCAGGGCCA GGAAACGTCG ATCTCGTACT CTAGCGGGGC GATCGCCTCC 300 

TTCGTGGCCC CGTATTTCGA GTACGTGCTT CGCGCCCCCC GAGCGGGCGC GCTCATCACC 360

GGCTCCGATG TCATCCTAGG GGAGGAGGAG TTATGGGAGG CGGTCTTTAA GAAAACCCGC 420 

CTGCAGACGT ACCTGACAGA CGTCGCGGCC CTGTTCGTGG CGGACGTACA GCACGCGGCT 480 

CTGCCCCGGC CCCCCTCCCC AACCCCCGCC GATTTCCGGG CGAGCGCGTC CCCGCGGGGC 540 

GGGTCCCGGT CCCGGACCCG GACCCGATCC CGGTCGCCCG GGAGAACGCC GAGGGGTGCG 600 

CCGGACCAGG GCTGGGGCGT CGAACGCAGG GATGGCCGAC CCCACGCCCG CCGATGAGGG 660

AACGGCCGCC GCCATCCTCA AACAGGCCAT CGCCGGGGAC CGCAGTCTGG TCGAGGTGGC 720 

GGAGGGGATC AGCAACCAGG CGCTGCTGCG CATGGCCTGC GAGGTGCGCC AGGTCAGCGA 780 

TCGCCAGCCG CGGTTTACCG CGACCAGCGT CCTGCGCGTT GACGTCACCC CCAGGGGGCG 840 

GTTGCGGTTC GTTCTGGACG GGAGTTCCGA CGACGCGTAC GTGGCGTCGG AGGATTACTT 900 

TAAGCGCTGC GGGGACCAGC CGACGTATCG CGGTTTTGCG GTCGTCGTCC TCACGGCCAA 960

CGAGGACCAC GTGCACAGCC TGGCCGTGCC CCCCCTCGTT CTGCTGCACC GGCTCTCCTT 1020 

GTTTCGCCCC ACGGACCTCC GGGACTTCGA GCTCGTCTGC CTGCTGATGT ACCTGGAGAA 1080 

CTGTCCCCGG AGCCACGCCA CGCCCTCGCT GTTCGTCAAG GTGTCGGCGT GGTTGGGGGT 1140 

CGTGGCCCGC CACGCGTCTC CCTTCGAGCG CGTCCGCTGC CTTCTCCTCC GCAGCTGCCA 1200 

CTGGATCCTG AACACGCTAA TGTGCATGGC GGGCGTGAAG CCCTTCGACG ACGAGCTAGT 1260

CCTGCCCCAC TGGTACATGG CCCACTACCT GCTGGCCAAC AATCCGCCCC CCGTCCTCTC 1320 

GGCCCTGTTT TGCGCCACCC CGCAGAGCTC TGCGTTGCAG TTGCCCGGGC CCGTCCCCCG 1380 

CACGGACTGT GTGGCCTATA ACCCGGCCGG CGTCATGGGA AGCTGCTGGA AATCCAAGGA 1440

CCTGCGTTCG GCTCTGGTGT ATTGGTGGCT TTCGGGGAGC CCCAAACGAC GGACCTCGTC 1500

GCTTTTCTAT CGGTTTTGCT AACTCCGGAA AATAAACGTG TTTTTTATGG AACGTTCCCT 1560

ACCTGTCGTG TCATCTCTCG GGGGATGGTG GTGGGCCTGT GTGTGTGTCT TGTGCACCGA 1620

AGGAGGAAAG TGGGGGGGTG GTGGTGCTGG TGGTGGAAAG ACATGATAGA GGGAACAAAG 1680

AAATAGAAGA AAACCACAAC CGGCGCGTGT CAGTAAATAC GGACGCGCGC ACACGCGGGG 1740

GTAAGTTGGA GCACGGGGCC CCAGTTTATT GACCAAATTC AGGGAAACAG AAACCGCATC 1800

TTTTCCTCGA AAGGGTACAC AAAGCTCCCG CCCTCGCCCC ACACGCCTTC CAGAACCCCC 1860

GTAAACACCA GTTGAATCTC GCGCAGGATC TCGCGCAGGT GATGGGCGCA GTCCACGGGG 1920

GGGAGCACCA AGGGCCGCGG GTACAGATCC ACGGGGACGC CGACCGACTC CCCGCCCCCG 1980

GGACATACGC GCACGACGCG TCTCCAGTAT TGCTCCGCGT CCAGCAGGGC GCCTCCGCGG 2040

AAGGCCGTTT GGGGCAGGGG GTCGTCGGCC TCGCCCGGGG GGGTCAGAAC GCTCCAGTAC 2100

TCCGCGTCCA GACGCCTCCC GAAGGCATCC AGGACAAAGC GGTCACAGGC GTCCTCCATG 2160

ATGCCCCGGG CCGCGCACAC GGCCTCCTCC GGCGGGCCGG CGGCCGGCCG CCGGAGGATT 2220

CGTCTCAGCG CGTCGCGCAT AACCTCGGCC GCCGCGGCGT ACGCGGGCCC GCGGAGAGGA 2280

AATCCCTGCA GGAAGTCGGT GTCATCGCGG GAGTTCCAGA ACCACGCCCC GGTCTGGCTC 2340

CAGGTGACGA CGTGGGTGTA GACGCCCTCT AGCGCCAGGG AGGGGGCGAG GCGCGGGCGT 2400

ATGCCGTTGG CCGAAAGTAC GGCGCGCACG GACGCCTCGA GGGCCCGGCG GGCGTCCTGG 2460

ATCGCGCCGT GCGCGGCGTC CGCGTCCCCG GGGTCCACGT TGAACAGCCC CCAGAACGCA 2520

GCCCCGGTGC CGCCGCAGAC CGCAAACTTC ACCGAGCTGG CCGTCTGCTC GATCTGCAGG 2580

CAGACGGCGG CCATGACCCC GCCGAGCAGC TGCCGGAGCG CGGGGCAGGC GTCGCACGCG 2640

TCCGGCACCA GGCGCTCCAG CACGGCCCGG GCCCAGGGCT CCGAGGGGGC GGCCGCCACC 2700

AGCGCGTCCA GCCTTTCCAG GCCCGCCCGC CCCCGGGCTT CCGGCAGCCC GGCCTCCCCG 2760

AGGCCCGCGA GGGCGGCCAG GAGCTGGGCC TGGAGCCCGG AGAAACAAAA CCGCGCCGTC 2820

CAGACCGGCC CGACGGCCGC CGGGGGGTCG AGTAGTTGGA TGGTGGTGGC CGTGGGGTGC 2880

CACCGCGCCA CCGCTTCCCG AAAGGCGGGC AGGAGGCGGC CGGCCGCCTC CGAGGCCACG 2940

GCCGGCCATG CCCGCGGGGG CAGGACGACC CTGGCGCCCA CCGCGGGCCA GGCCCCCAGG 3000

CACG 3005



(2) INFORMATION FOR SEQ ID NO: 169: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:



Xaa Xaa Xaa Xaa Xaa Ala Leu Glu Arg Glu Gln Arg Ala Ala Asp Arg
1 5 10 15

Ala Ala Gly Gly Gly Ala Gly Arg Pro Ala Glu Ala Asp Leu Leu Arg
20 25 30

Ala Asp Tyr Asp Ile Ile Asp Val Ser Lys Ser Met Asp Asp Asp Thr
35 40 45

Tyr Val Ala Asn Ser Phe Gln His Gln Tyr Ile Pro Ala Tyr Gly Gln
50 55 60

Asp Leu Glu Arg Leu Ser Arg Leu Trp Glu His Glu Leu Val Arg Cys
65 70 75 80

Phe Lys Ile Leu Arg His Arg Asn Asn Gln Gly Gln Glu Thr Ser Ile
85 90 95

Ser Tyr Ser Ser Gly Ala Ile Ala Ser Phe Val Ala Pro Tyr Phe Glu
100 105 110

Tyr Val Leu Arg Ala Pro Arg Ala Gly Ala Leu Ile Thr Gly Ser Asp
115 120 125

Val Ile Leu Gly Glu Glu Glu Leu Trp Glu Ala Val Phe Lys Lys Thr
130 135 140

Arg Leu Gln Thr Tyr Leu Thr Asp Val Ala Ala Leu Phe Val Ala Asp
145 150 155 160

Val Gln His Ala Ala Leu Pro Arg Pro Pro Ser Pro Thr Pro Ala Asp
165 170 175

Phe Arg Ala Ser Asp Arg Gly Gly Ser Arg Ser Arg Thr Arg Thr Arg
180 185 190

Ser Arg Ser Pro Gly Arg Thr Pro Arg Gly Ala Pro Asp Gln Gly Trp
195 200 205

Gly Val Glu Arg Arg Asp Gly Arg Pro His Ala Arg Arg
210 215 220

(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 302 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

Val Arg Arg Thr Arg Ala Gly Asn Ala Gly Met Ala Asp Pro Thr Pro
1 5 10 15

Ala Asp Glu Gly Thr Ala Ala Ala Ile Leu Lys Gln Ala Ile Ala Gly
20 25 30

Asp Arg Ser Leu Val Glu Val Ala Glu Gly Ile Ser Asn Gln Ala Leu
35 40 45

Leu Arg Met Ala Cys Glu Val Arg Gln Val Ser Asp Arg Gln Pro Arg
50 55 60

Phe Thr Ala Thr Ser Val Leu Arg Val Asp Val Thr Pro Arg Gly Arg
65 70 75 80

Leu Arg Phe Val Leu Asp Gly Ser Ser Asp Asp Ala Tyr Val Ala Ser
85 90 95

Glu Asp Tyr Phe Lys Arg Cys Gly Asp Gln Pro Tyr Gly Phe Ala Val
100 105 110

Val Val Leu Thr Ala Asn Glu Asp His Val His Ser Leu Ala Val Pro
115 120 125

Pro Leu Val Leu Leu His Arg Leu Ser Leu Phe Arg Pro Thr Asp Leu
130 135 140

Arg Asp Phe Glu Leu Val Cys Leu Leu Met Tyr Leu Glu Asn Cys Pro
145 150 155 160

Arg Ser His Ala Thr Pro Ser Leu Phe Val Lys Val Ser Ala Trp Leu
165 170 175

Gly Val Val Ala Arg His Asp Phe Glu Arg Val Arg Cys Leu Leu Leu
180 185 190

Arg Ser Cys His Trp Ile Leu Asn Thr Leu Met Cys Met Ala Gly Val
195 200 205

Lys Pro Phe Asp Asp Glu Leu Val Leu Pro His Trp Tyr Met Ala His
210 215 220

Tyr Leu Leu Ala Asn Asn Pro Pro Pro Val Leu Ser Ala Leu Phe Cys
225 230 235 240

Ala Thr Pro Gln Ser Ser Ala Leu Gln Leu Pro Gly Pro Val Pro Arg
245 250 255

Thr Asp Cys Val Ala Tyr Asn Pro Ala Gly Val Met Gly Ser Cys Trp
260 265 270

Lys Ser Lys Asp Leu Arg Ser Ala Leu Val Tyr Trp Trp Leu Ser Gly
275 280 285

Ser Pro Lys Arg Arg Thr Ser Ser Leu Phe Tyr Arg Phe Cys
290 295 300

(2) INFORMATION FOR SEQ ID NO: 171:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 402 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

Ala Cys Leu Gly Ala Trp Pro Ala Val Gly Ala Arg Val Val Leu Pro
1 5 10 15

Pro Arg Ala Trp Pro Ala Val Ala Ser Glu Ala Ala Gly Arg Leu Leu
20 25 30

Pro Ala Phe Arg Glu Ala Val Ala Arg Trp His Pro Thr Ala Thr Thr
35 40 45

Ile Gln Leu Leu Asp Pro Pro Ala Ala Val Gly Pro Val Trp Thr Ala
50 55 60

Arg Phe Cys Phe Ser Gly Leu Gln Ala Gln Leu Leu Ala Ala Gly Leu
65 70 75 80

Gly Glu Ala Gly Leu Pro Glu Arg Arg Ala Gly Leu Glu Arg Leu Asp
85 90 95

Ala Leu Val Ala Ala Ala Pro Ser Glu Pro Trp Ala Arg Ala Val Leu
100 105 110

Glu Arg Leu Val Pro Asp Ala Cys Asp Ala Cys Pro Ala Leu Arg Gln
115 120 125

Leu Leu Gly Gly Val Met Ala Ala Val Cys Leu Gln Ile Glu Gln Thr
130 135 140

Ala Ser Ser Val Lys Phe Ala Val Cys Gly Gly Thr Gly Ala Ala Phe
145 150 155 160

Trp Gly Leu Phe Asn Val Asp Pro Gly Asp Ala Asp Ala Ala His Gly
165 170 175

Ala Ile Gln Asp Ala Arg Arg Ala Leu Glu Ala Ser Val Arg Ala Val
180 185 190

Leu Ser Ala Asn Gly Ile Arg Pro Arg Leu Ala Pro Ser Leu Ala Leu
195 200 205

Glu Gly Val Tyr Thr His Val Val Thr Trp Ser Gln Thr Gly Ala Trp
210 215 220

Phe Trp Asn Ser Arg Asp Asp Thr Asp Phe Leu Gln Gly Phe Pro Leu
225 230 235 240

Arg Gly Pro Ala Tyr Ala Ala Ala Ala Glu Val Met Arg Asp Ala Leu
245 250 255

Arg Arg Ile Leu Arg Arg Pro Ala Ala Gly Pro Pro Glu Glu Ala Val
260 265 270

Cys Ala Arg Ile Met Glu Asp Ala Cys Asp Arg Phe Val Leu Asp Ala
275 280 285

Phe Gly Arg Arg Leu Asp Ala Glu Tyr Trp Ser Val Leu Thr Pro Pro
290 295 300

Gly Glu Ala Asp Asp Pro Leu Pro Gln Thr Ala Phe Arg Gly Gly Ala
305 310 315 320

Leu Leu Asp Ala Glu Gln Tyr Trp Arg Arg Val Val Arg Val Cys Pro
325 330 335

Gly Gly Gly Glu Ser Val Gly Val Pro Val Asp Leu Tyr Pro Arg Pro
340 345 350

Leu Val Leu Pro Pro Val Asp Cys Ala His His Leu Arg Glu Ile Leu
355 360 365

Arg Glu Ile Gln Leu Val Phe Thr Gly Val Leu Glu Gly Val Trp Gly
370 375 380

Glu Gly Gly Ser Phe Val Tyr Pro Phe Glu Glu Lys Met Arg Phe Leu
385 390 395 400

Phe Pro



(2) INFORMATION FOR SEQ ID NO: 172: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 



CGCGACGCGG GCCGCTGGGT CCGCGGACCG GAGAACGACG TCCGCGGTCC GCGGGGCGTA 60

CCCGGACCCC ATGGCCAGCC TGTCGCCGCG ACCCCCGGCG CCCCGCCGAC ACCACCACCA 120

CCACCACCGC CGCCGCCGCC GGCGCGCCCC CCGCCGGCGC TCGACCGCCT CTGACTCATC 180

AAAATCCGGA TCCTCGTCGT CGGCGTCCTC CGCCTCCTCC TCCGCCTCCT CCTCCTCGTC 240

TGCATCCGCC TCCTCGTCTG ACGACGACGA CGACGACGCC GCCCGCGCCC CCGCCAGCGC 300

CGCAGACCAC GCCGCGGGCG GGACCCTCGG CGCGGACGAC GAGGAGGCGG GGGTGCCCGC 360

GAGGGCCCCG GGGGCGGCGC CCCGGCCGAG CCCGCCCAGG GCCGAGCCCG CCCCCGGGGC 420

CGGGGCG 428



(2) INFORMATION FOR SEQ ID NO: 173: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

CGCGCTCCGT GTGGACGATC GCCCCGTCGC CTGGCTGATA TAGTCCTCGG GGCGCGCGGG 60

GCGGGGGGAA AGGAGGAGGA CGCGGAGGAG GAGCGATCGA CGCCGCCGCG CCCCGGCTCG 120 

CCGGGGTTCC GCCCCCAGGT GGAACCGCAT TATGCGCGGC CCCGCCCCGA CGCCCGCGCG 180 

TCCGCGTCCG TGGCGGCGGC CCGTTGGTCG CGCCGCCGCC GGCTCCGCCC GCGCGGCATC 240

TCATTAGCGC CCGGCGCGGG CGGCTTCCGC TTCCGCCCGC GATGCTAATG AGACCCTCGT 300 

CGCGGGCGGG CTCGCTCCCC TGCCCTTCCG GGTTCGTGGT AATGAGATGC CGGCCCCGCG 360 

CTCCCGTTGG CCCCCGCCGG CCCCTTTGGG GCCGGCGAGG TCGCCCCGTT GGTCCGCGGG 420 

CGGCTCCGCC CCAAAGGGGG CGGGGCCGCA GGGTAAAAGA AGTGAGAACG CGAAGCGTTC 480 

GCACTTCGTC CTAATAGTAT ATATATTATT AGGGCAAAGT GCGAGCACTG GCGCCCTGCC 540

CGGGGCCCGC GTCATCCCGC GGGCTCCGCC CCAAAGGGGG CGGGGCCGCA GGGTAAAAGA 600 

AGTGAGAACG CGAAGCGTTC GCACTTCGTC CTAATAGTAT ATATATTATT AGGGCAAAGT 660 

GCGAGCACTG GCGCCCTGCC CGGGGCCCGC GTCATCCCGC GGGCCCCGCC CCGAGGCGGG 720 

CCCGGACGGG GGGCGGGCCG TTCCTCGCGC ACATAAAGGG CCGGCGTCCC GGTCGCCGCC 780 

GCACCAGGGG CACACCGGCT GCGCGGCGGA GACCGGGACG GCAGCGGCGG CATCGCGAAG 840

GGGGCCACAG CGAGACAGAG ACGCCGGCGG CGAGCGGGGC ACCGACGCAC CCGGATCGGA 900 

TCGGATACAG AGACGCGGGC GCATCGGTTC CTTTTCGTTC TGCCTTTCCC TCCCCCCCCC 960 

CCCCACCCTG TACGTACCGC GAGGACCCAT CCACCCACTG CAGCCTTATC GCAGGTACGG 1020 

TGACCCGGGG GCGCCGGGGC GGGGGGACGG GACGGGGGGA CGGGACGGGG GGACGGGACG 1080 

GGGGGACGGG ACGGGGGGAC GGGACGGGGG GACGGGACGG GGGGACGGGA CGGGGGGACG 1140

GGACGGGGGG ACGGGACGGG GGGACGGGAC GGGGGGACGG GACGGGGGGA CGGGACGGGG 1200 

GGACGGGACG GGGGGACGGG ACGGGGGGAC GGGACGGGGG GACGGGACGG GGGGACGGGA 1260 

CGGGGGGACG GGACGGGGGG ACGGGACGGG GGGACGGGAC GGGGGGACGG GACGGGGGGA 1320 

CGGGACGGGG GGACGGGACG GGGGGACGGG ACGGGGGGAC GGGACGGGGG GACGGGACGG 1380 

GGGGACGGGA CGGGGGGACG GGACGGGGGG ACGGGACGGG GGGACGGGAC GGGGGGGCCC 1440

CGATCCCAAC ATCCGCGCTT TCTCGCAGGC CGGGCGCCGC CTTCGTGGAC GGGACACCGG 1500 

TGTGGTAACT GGCGACAAGG CGTCGCCACT ATGGCAGACA TCCCCCCGGA CCCGCCCGCG 1560 

CTCAACACGA CGCCTGCGAA TCATGCTCCC CCATCCCCAC CCCCGGGTTC ACGGAAGCGC 1620 

AGACGCCCCG TCCTCCCCAG CTCGTCGGAA TCTGAGGGTA AGCCCGACAC AGAATCGGAA 1680 

TCCTCCTCGA CCGAGTCGTC CGAGGATGAG GCGGGAGACC TACGCGGCGG GCGCCGTCGC 1740

TCCCCGCGGG AGCTCGGGGG GAGGTATTTT TTGGATCTGT CGGCAGAATC GACCACGGGG 1800 

ACGGAATCGG AGGGAACGGG GCCGTCGGAC GACGATGATG ATGATGCGTC AGACGGCTGG 1860 

TTGGTTGACA CCCCCCCCCG TAAATCCAAG CGACCCCGAA TCAACCTGCG ATTAACGAGC 1920 

TCCCCCGACC GGCGCGCGGG TGTGGTTTTC CCCGAGGTGT GGAGAAACGA CAGACCTATC 1980 

CGCGCGGCGC AACCCCAGGC CCCGGCCCAG TCTTCCGGGG ATCGCGCAGC CGCACCGCGG 2040

CGCTCTGCTC GCCAGGCCCA GATGCGGAGC GGAGCCGCCT GGACGCTTGA TCTGCATTAC 2100 

ATACGCCAGT GCGTCAACCA GCTCTTTCGG ATCCTGCGTG CCGCCCCGAA CCCGCCCGGC 2160 

AGCGCCAACC GCCTGCGCCA CCTGGTGCGA GACTGCTACC TTATGGGCTA CTGCCGGACC 2220 

CGCCTGGGGC CGCGCACGTG GGGCCGCCTG CTGCAGATCT CGGGCGGAAC CTGGGACGTG 2280 

CGCCTGCGAA ACGCAATCCG GGAGGTCGAG GCGCGTTTTG AACCCGCCGC CGAGCCCGTG 2340

TGCGAGCTGC CCTGTCTGAA CGCCAGGCGT TACGGCCCCG AGTGTGATGT TGGCAATCTC 2400 

GAGACCAACG GCGGCTCGAC GAGCGATGAT GAGATATCGG ATGCGACGGA CTCGGACGAT 2460 

ACCCTCGCGT CCCATTCCGA CACGGAGGGG GGGCCCTCCC CGGCCGGCCG GGAGAACCCG 2520

GAATCCGCGT CCGGCGGGGC TATCGCGGCT CGGCTGGAGT GTGAGTTTGG GACGTTTGAC 2580

TGGACGTCCG AGGAGGGCTC CCAGCCCTGG CTGTCCGCGG TGGTCGCCGA TACCAGCTCC 2640

GCCGAACGCT CTGGCCTACC CGCCC-GGGGC GCGTGTCGCG CAACGGAAGC CCCAGAACGC 2700

GAGGACGGGT GCCGAAAAAT GCGCTTCCCC GCCGCCTGCC CCTATCCCTG CGGCCACACA 2760

TTTCTCCGGC CATGAGCGCG GGACCCCCAG CCCGGTGTGT TTGCCAAACG AAAATAAACG 2820

CCCTACAAGA AAGCTTTTGT GTCTGAGTGT CTGGTTTTTC TGGGGGTGGA GGAAGGAACG 2880

ACAAAAAAAG AAACAAACGC GACACCGCTC GTACGTGTAA TGGGGCGCAG TGTTTTTTAT 2940

TAGCATCGGG GGGGGGTTAG AGGTTGGTGA TTGGATAGCA AACGTGGGAT GACGGAGGCC 3000

ACTCGTCGCC AACGGCCAGC GGGGGCCCGG GGTTCTGGGG GTCATCGTCC CCCGTCTGCC 3060

AGGAGGGCTC ATCGGGAATC TCGGGTCGCC CCATGCACGT AAAACACGGG CGCTGCGTGG 3120

GGTGGGTCGC CGGATGCGGG CGGGATGATG CGGGGCGGGG TTTGTTGTGA GGAGCCACGA 3180

GGGACCGTAG CCAGCGAAGA CAGCTGCGTT CCCGGTCGCC GGGCACCACC ACGCCGTATT 3240

GGTATTCGTA TCGGCTAAGG AGATTTTCCA GGGGGTGATT AGGCGCTGCG GGGAACGGGG 3300

TCCACGACAC GGTCCGCTCG GGCAAAAACC GATCGGGCAG GGGCCACGGT TCCCCCACCC 3360

ACGCGTCGTT GGTCTTCATG GCGATGAAGC GAAACCCCAG CCGGGTTTTT TGTGCGTACT 3420

CTAAAAACGG CACACACAGG TCCGCCGCCC CGACCACCCA CAGGTGGTAT AGCCGGTGGG 3480

GGCCGGGGCG CTCTTGATGC AGGAGCCGAA AACACGCAGG GGCATCCAGA ATCTCGATGC 3540

TTTCCAGGGG GTCGTCCTCC GCAAACAGGC CCGTCGTGGT GTTTGGGGGA CAGCGACAGG 3600

AGCGGGTTCG CACGATCGGT CGGGTGAATT TGGGCAAGTC CATCAGAGGC TCGGCCAGCC 3660

TGCGAAGGTT CGCCGGGCGA ACCACCACCG GGGTTCCCAG AGGCTCGGAG GCCAGGATCC 3720

GGCATTGCCG AAGCAGAAAA CTCCACAGAG CCGGGCTTGC GTCAGCGGAA GTCCGCGGCA 3780

GGGCGTTTCG TTGGTCTAGG AGGGTAACCA CACTTACAAC AACAACGCCC ATGTCGGTAT 3840

ATTAGGCCCG TGGTCCGATC TTCACTCACT CGCCTGTCTG CGGACCTATG CACGGCGGGA 3900

CGGCGCGCGG ACCCGGGGGG GCTGCTTGCT ATCACACGGC CCGTTCGCAC GTTCGATTTT 3960

TTCAGCCTTG TTTGGTTGGC TAGGTATCCC GGATAATCTG ACGTTCCGGA TATAGGGGGC 4020

GGGGGTAGTG GGGGGGTGTG TCGACAAACT GCCGCTTCTT AAAACACCGG GGCCCGTCGC 4080

TCGGGGTGCT CGTTGGTTGG CACGCGCGAC GCGGCGAATG GCCTGTCGTA AGTTCTGTGG 4140

GGTCTACCGT AGACCCGACA AGAGACAGGA GGCGTCCGTC CCGCCGGAGA CAAACACGGC 4200

CCCGGCCTTC CCGGCGAGCA CCTTTTATAC CCCCGCGGAG GATGCGTACC TGGCCCCCGG 4260

GCCCCCGGAA ACCATCCACC CTTCCCGCCC ACCGTCCCCC GGCGAGGCTG CGCGCCTGTG 4320

TCAGCTGCAG GAGATCTTGG CCCAGATGCA CAGCGACGAG GACTACCCCA TCGTGGACGC 4380

CGCGGGTGCG GAGGAGGAAG ACGAGGCCGA CGATGACGCC CCGGATGACG TGGCCTACCC 4440

GGAGGACTAC GCGGAGGGGC GTTTTCTGTC CATGGTTTCG GCCGCCCCCC TGCCCGGAGC 4500

CAGCGGCCAT CCTCCTGTTC CGGGCCGCGC AGCCCCCCCC GACGTCCGGA CCTGCGACAG 4560

CGGTAAGGTG GGGGCCACGG GGTTCACCCC GGAAGAGCTC GACACCATGG ACCGGGAGGC 4620

ACTTCGGGCC ATCAGCCGCG GGTGCAAGCC CCCTTCGACC CTGGCAAAAC TGGTGACCGG 4680

GCTGGGATTC GCGATCCACG GAGCGCTCAT CCCGGGGTCG GAGGGGTGTG TCTTTGATAG 4740

CAGCCACCCG AACTACCCTC ATCGGGTAAT CGTCAAGGCG GGGTGGTACG CCAGCACGAA 4800

CCACGAGGCG CGGCTGCTGA GACGCCTGAA CCACCCCGCG ATCCTACCCC TCCTGGACCT 4860

GCACGTCGTT TCTGGGGTCA CGTGTCTGGT CCTCCCCAAG TATCACTGCG ACCTGTATAC 4920

CTATCTGAGC AAGCGCCCGT CTCCGTTGGG CCACCTACAG ATAACCGCGG TCTCCCGGCA 4980

GCTCTTGAGC GCCATCGACT ACGTCCACTG CGAAGGCATC ATCCACCGCG ATATTAAGAC 5040

CGAGAACATC CTCATCAACA CCCCCGAGAA CATCTGTCTG GGGGACTTTG GGGCGGCGTG 5100

CTTTGTGCGC GGGTGTCGAT CGAGCCCCTT CCATTACGGG ATCGCAGGCA CCATCGATAC 5160

AAACGCCCCC GAGGTCCTGG CCGGGGATCC GTACACCCAG GTAATCGACA TCTGGAGCGC 5220

CGGCCTGGTG ATCTTTGAGA CCGCCGTCCA CACCGCGTCC TTGTTCTCGG CCCCGCGCGA 5280

CCCCGAAAGG CGGCCGTGCG ACAACCAGAT CGCGCGCATC ATCCGACAGG CCCAGGTACA 5340

CGTCGACGAG TTTCCAACGC ACGCGGAATC GCGCCTCACC GCGCACTACC GCTCGCGGGC 5400

GGCCGGGAAC AATCGTCCGG CGTGGACCCG ACCGGCATGG ACCCGCTACT ACAAGATCCA 5460

CACAGACGTC GAATATCTCA TCTGCAAAGC CCTTACCTTT GACGCGGCGC TCCGCCCAAG 5520

CGCCGCGGAG TTGCTGCGCC TGCCGCTATT TCACCCTAAG TGACCCCGCT CCCCCCGGGG 5580

GGCGTGGAGG GGGGGCTGGT TGGATGTTTT TGCACAAAAA GACGCGGCCC TCGGGCTTTG 5640

GTGTTTTTGG CACCTTGCCG CCCGGCGTCA TGCACGCCAT CGCTCCCAGG TTGCTTCTTC 5700

TTTTTGTTCT TTCTGGTCTT CCGGGGACAC GCGGCGGGTC GGGTGTCCCC GGACCAATTA 5760

ATCCCCCCAA CAACGATGTT GTTTTCCCGG GAGGTTCCCC CGTGGCTCAA TATTGTTATG 5820

CCTATCCCCG GTTGGACGAT CCCGGGCCCT TGGGTTCCGC GGACGCCGGG CGGCAAGACC 5880

TGCCCCGGCG CGTCGTCCGT CACGAGCCCC TGGGCCGCTC GTTCCTCACG GGGGGGCTGG 5940

TTTTGCTGGC GCCGCCGGTA CGCGGATTTG GCGCACCCAA CGCAACGTAT GCGGCCCGTG 6000

TGACGTACTA CCGGCTCACC CGCGCCTGCC GTCAGCCCAT CCTCCTTCGG CAGTATGGAG 6060

GGTGTCGCGG CGGCGAGCCG CCGTCCCCAA AGACGTGCGG GTCGTACACG TACACGTACC 6120

AGGGCGGCGG GCCTCCGACC CGGTACGCTC TCGTAAATGC TTCCCTGCTG GTGCCGATCT 6180

GGGACCGCGC CGCGGAGACA TTCGAGTACC AGATCGAACT CGGCGGCGAG CTGCACGTGG 6240

GTCTGTTGTG GGTAGAGGTG GGCGGGGAGG GCCCCGCCCC CACCGCCCCC CCACAGGCGG 6300

CGCGTGCGGA GGGCGGCCCG TGCGTCCCCC CGGTCCCCGC GGGCCGCCCG TGGCGCTCGG 6360

TGCCCCCGGT ATGGTATTCC GCCCCCAACC CCGGGTTTCG TGGCCTGCGT TTCCGGGAGC 6420

GCTGTCTGCC CCCACAGACG CCCGCCGCCC CCAGCGACCT ACCACGCGTC GCTTTTGCTC 6480

CCCAGAGCCT GCTGGTGGGG ATTACGGGCC GCACGTTTAT TCGGATGGCA CGACCCACGG 6540

AAGACGTCGG GGTCCTGCCA CCCCATTGGG CCCCCGGGGC CCTAGATGAC GGTCCGTACG 6600

CCCCCTTCCC ACCCCGCCCG CGGTTTCGAC GCGCCCTGCG GACAGACCCC GAGGGGGTCG 6660

ACCCCGACGT TCGGGCCCCC CTAACCGGGC GGCGCCTCAT GGCCTTGACC GAGGACGCGT 6720

CCTCCGATTC GCCTACGTCC GCTCCGGAGA AGACGCCCCT CCCTGTGTCG GCCACCGCCA 6780

TGGCGCCCTC AGTCGACCCA AGCGCGGAAC CGACCGCCCC CGCAACCACT ACTCCCCCCG 6840

ACGAGATGGC CACACAAGCC GCAACGGTCG CCGTTACGCC GGAGGAAACG GCAGTCGCCT 6900

CCCCGCCCGC GACTGCATCC GTGGAGTCGT CGCCACTCCC CGCCGCGGCG GCAACGCCCG 6960

GGGCCGGGCA CACGAACACC AGCAGCGCCC CCGCAGCGAA AACGCCCCCC ACCACACCAG 7020

CCCCCACGAC CCCCCCGCCC ACGTCTACCC ACGCGACCCC CCGCCCCACG AGTCCGGGGC 7080

CCCAAACAAC CCCTCCCGGA CCCGCAACCC CGGGTCCGGT GGGCGCCTCC GCCGCACCCA 7140

CGGCCGATTC CCCCCTCACC GCCTCGCCCC CCGCTACCGC GCCGGGGCCC TCGGCCGCCA 7200

ACGTTTCGGT CGCCGCGACC ACCGCCACGC TCGGAACCCG GGGCACCGCC CGTACCCCCC 7260

CAACGGACCC AAAGACGCAC CCACACGGAC CCGCGGACGC TCCCCCCGGC TCGCCAGCCC 7320

CCCCACCCCC CGAACATCGC GGCGGACCCG AGGAGTTTGA GGGCGCCGGG GACGGCGAAC 7380

CCCCCGATGA CGACGACAGC GCCACCGGCC TCGCCTTCCG AACTCCGAAC CCCAACAAAC 7440

CACCCCCCGC GCGCCCCGGG CCCATCCGCC CCACGCTCCC GCCAGGAATT CTTGGGCCGC 7500

TCGCCCCCAA CACGCCTCGC CCCCCCGCCC AAGCTCCCGC TAAGGACATG CCCTCGGGCC 7560

CCACACCCCA ACACATCCCC CTGTTCTGGT TCCTAACGGC CTCCCCTGCT CTAGATATCC 7620

TCTTTATCAT CAGCACCACC ATCCACACGG CGGCGTTCGT TTGTCTGGTC GCCTTGGCAG 7680




CACAACTTTG GCGCGGCCGG GCGGGGCGCA GGCGATACGC GCACCCGAGC GTGCGTTACG 7740

TATGTCTGCC ACCCGAGCGG GATTAGGGGG TGGGGTGGGG GCGAGAAACG ATGAAGGACG 7800

GGAAAGGGAA CAGCGACCAA ATGCCACGAT AAGAACAATA AACCTGTGAC GTCAATCGGA 7860

TATGTGAGTT TGGTTGTGTT TTGTGGGACT GGGGGCGGGG GGTGGGAGGT ATCAGTGGGT 7920

GACAGAGTCT TTTAAAAGAC GTGTCCCGGG GCCCTCGAGA CGCGCAACTT TTGGCCACAC 7980

AGAGAAAGGC CCCCAGACGA AGTCACCCGG GTCCCCGAAC AAAAACAAAA ACCTTGACCG 8040

CCGCCGGGGG GCGTGCCTGT TGTTTTGGTC TCAATGGATC GGTATGCCGT TCGGACCTGG 8100

GGGATTGTGG GAATCCTCGG GTGTGCTGCT GTTGGGGCCG CACCCACCGG CCCCGCGTCC 8160

GATACAACAA ACGCGACCGC ACGCCTCCCC ACGCACCCCC CACTCATCCG TTCCGGGGGC 8220

TTTGCCGTCC CCCTCATCGT GGGGGGGCTG TGTCTCATGA TTCTGGGGAT GGCGTGTCTA 8280

CTCGAGGTCC TGCGTCGCCT GGGTCGCGAG TTGGCGAGGT GCTGCCCCCA CGCGGGCCAA 8340

TTTGCCCCAT GATTTTTCGC CTTTCTGGCC TTGCCCCCAC CCCATCGCCC CGATTGTGTG 8400

TCGGGTGCCC GGGGTACAGC AGCTATGGAG CGGTCGGTAA TATAACTTTG GTTGTCGCCA 8460

CACGCCCCGT GCCGGGCATG GGTTGTGCGG GAAAGACGAA ATAATCCGGC GATCCCCAAG 8520

CGTACCAACT TGGGGGGGGG GGGAAAGAAA CTAAAAACAC ATCAAGCCCA CAACCCATCC 8580

CACAAGGGGG GTTATGGCGG ACCCACCGCA CCACCATACT CCGATTCGAC CACATATGCA 8640

ACCAAATCAC CCCCAGAGGG GAGGTTCCAT TTTTACGAGG AGGAGGAGTA TAATAGAGTC 8700

TTTGTGTTTA AAACCCGGGG TCGGTGTGGT GTTCGGTCAT AAGCTGCATT GCGAACGACT 8760

AGTCGCCGTT TTTCGTGTGC ATCGCGTATC ACGGCATGGG GCGTTTGACC TCCGGCGTCG 8820

GGACGGCGGC CCTGCTAGTT GTCGCGGTGG GACTCCGCGT CGTCTGCGCC AAATACGCCT 8880

TAGCAGACCC CTCGCTTAAG ATGGCCGATC CCAATCGATT TCGCGGGAAG AACCTTCCGG 8940

TTTTGGACCA GCTGACCGAC CCCCCCGGGG TGAAGCGTGT TTACCACATT CAGCCGAGCC 9000

TGGAGGACCC GTTCCAGCCC CCCAGCATCC CGATCACTGT GTACTACGCA GTGCTGGAAC 9060

GTGCCTGCCG CAGCGTGCTC CTACATGCCC CATCGGAGGC CCCCCAGATC GTGCGCGGGG 9120

CTTCGGACGA GGCCCGAAAG CACACGTACA ACCTGACCAT CGCCTGGTAT CGCATGGGAG 9180

ACAATTGCGC TATCCCCATC ACGGTTATGG AATACACCGA GTGCCCCTAC AACAAGTCGT 9240

TGGGGGTCTG CCCCATCCGA ACGCAGCCCC GCTGGAGCTA CTATGACAGC TTTAGCGCCG 9300

TCAGCGAGGA TAACCTGGGA TTCCTGATGC ACGCCCCCGC CTTCGAGACC GCGGGTACGT 9360

ACCTGCGGCT AGTGAAGATA AACGACTGGA CGGAGATCAC ACAATTTATC CTGGAGCACC 9420

GGGCCCGCGC CTCCTGCAAG TACGCTCTCC CCCTGCGCAT CCCCCCGGCA GCGTGCCTCA 9480

CCTCGAAGGC CTACCAACAG GGCGTGACGG TCGACAGCAT CGGGATGCTC CCCCGCTTTA 9540

TCCCCGAAAA CCAGCGCACC GTCGCCCTAT ACAGCTTAAA AATCGCCGGG TGGCACGGCC 9600

CCAAGCCCCC GTACACCAGC ACCCTGCTGC CGCCGGAGCT GTCCGACACC ACCAACGCCA 9660

CGCAACCCGA ACTCGTTCCG GAAGACCCCG AGGACTCGGC CCTCTTAGAG GATCCCGCCG 9720

GGACGGTGTC TTCGCAGATC CCCCCAAACT GGCACATCCC GTCGATCCAG GACGTCGCGC 9780

CGCACCACGC CCCCGCCGCC CCCAGCAACC CGGGCCTGAT CATCGGCGCG CTGGCCGGCA 9840

GTACCCTGGC GGTGCTGGTC ATCGGCGGTA TTGCGTTTTG GGTACGCCGC CGCGCTCAGA 9900

TGGCCCCCAA GCGCCTACGT CTCCCCCACA TCCGGGATGA CGACGCGCCC CCCTCGCACC 9960

AGCCATTGTT TTACTAGAGG AGTATCCCCG CTCCCGTGTA CCTCTGGGCC CGTGTGGGAG 10020

GGTGGCTGGG GTATTTGGGT GGGACTTGGA CTCCGCATAA AGGGAGTCTC GAAGGAGGGA 10080

AACTAGGACA GTTCATAGGC CGGGAGCGTG GGGCGCGCAC CGCTGTCCCG ACGATTAGCC 10140

ACCGCGCCCA CAGTCACCTC GACCCGTCCG ATCCCGGTAT GCCCGGCCGC TCGCTGCAGG 10200

GCCTGGCGAT CCTGGGCCTG TGGGTCTGCG CCACCGGCCT GGTCGTCCGC GGCCCCACGG 10260

TCAGTCTGGT CTCAGACTCA CTCGTGGATG CCGGGGCCGT GGGGCCCCAG GGCTTCGTGG 10320

AAGAGGACCT GCGTGTTTTC GGGGAGCTTC ATTTTGTGGG GGCCCAGGTC CCCCACACAA 10380

ACTACTACGA CGGCATCATC GAGCTGTTTC ACTACCCCCT GGGGAACCAC TGCCCCCGCG 10440

TTGTACACGT GGTCACACTG ACCGCATGCC CCCGCCGCCC CGCCGTGGCG TTCACCTTGT 10500

GTCGCTCGAC GCACCACGCC CACAGCCCCG CCTATCCGAC CCTGGAGCTG GGTCTGGCGC 10560

GGCAGCCGCT TCTGCGGGTT CGAACGGCAA CGCGCGACTA TGCCGGTCTG TATGTCCTGC 10620

GCGTATGGGT CGGCAGCGCG ACGAACGCCA GCCTGTTTGT TTTGGGGGTG GCGCTCTCTG 10680

CCAACGGSAC GTTTGTGTAT AACGGCTCGG ACTACGGCTC CTGCGATCCG GCGCAGCTTC 10740

CCTTTTCGGC CCCGCGCCTG GGACCCTCGA GCGTATACAC CCCCGGAGCC TCCCGGCCCA 10800

CCCCTCCACG GACAACGACA TCCCCGTCCT CCCCCCGAGA CCCGACCCCC GCCCCCGGGG 10860

ACACAGGGAC GCCCGCGCCC GCGAGCGGCG AGAGAGCCCC GCCCAATTCC ACGCGATCGG 10920

CCAGCGAATC GAGACACAGG CTAACCGTAG CCCAGGTAAT CCAGATCGCC ATACCGGCGT 10980

CCATCATCGC CTTTGTGTTT CTGGGCAGCT GTATCTGCTT CATCCATAGA TGCCAGCGCC 11040

GATACAGGCG CCCCCGCGGC CAGATTTACA ACCCCGGGGG CGTTTCCTGC GCGGTCAACG 11100

AGGCGGCCAT GGCCCGCCTC GGAGCCGAGC TGCGATCCCA CCCAAACACC CCCCCCAAAC 11160

CCCGACGCCG TTCGTCGTCG TCCACGACCA TGCCTTCCCT AACGTCGATA GCTGAGGAAT 11220

CGGAGCCAGG TCCAGTCGTG CTGCTGTCCG TCAGTCCTCG GCCCCGCAGT GGCCCGACGG 11280

CCCCCCAAGA GGTCTAGGTC CAAGCGGGCC GTTCGGCAGG CCCGCCCCAC CGCCCCCATC 11340

GTGGTTATTT CCCCCCCAAT AAACCGATGT TATTTGCCTA TATGCGTGTG TTGGATCCCT 11400

TTGTGATCGT TCGTCATTCC CCGGATGGCA TGGGAGGCGG GTAATGGATG GGCGGGGCCC 11460

GGGGGGGGAG GAAAAAGAAT AAAGGGGGTA GTGTCGGAGA GGCCCGCCGC GCATTTAAGG 11520

AGTCGCCGCC CCGACTCTGT GTCTTCGGGT GACTTGGTGC GCCGCCGTCA GCTAGTCTCC 11580

GATCTGCCCC GACCGACGGC TCCTGCCACC CGAACATGGC TCGCGGGGCC GGGTTGGTGT 11640

TTTTTGTTGG AGTTTGGGTC GTATCGTGCC TGGCGGCAGC ACCCAGAACG TCCTGGAAAC 11700

GGGTAACCTC GGGCGAGGAC GTGGTGTTGC TTCCGGCGCC CGCGGGGCCG GAGGAACGCA 11760

CCCGGGCCCA CAAACTACTG TGGGCCGCGG AACCCCTGGA TGCCTGCGGT CCCCTGCGCC 11820

CGTCGTGGGT GGCGCTGTGG CCGCCCCGAC GGGTGCTCGA GACGGTCGTG GATGCGGCGT 11880

GCATGCGCGC CCCGGAACCG CTCGCCATAG CATACAGTCC CCCGTTCCCC GCGGGCGACG 11940

AGGGACTGTA TTCGGAGTTG GCGTGGCGCG ATCGCGTAGC CGTGGTCAAC GAGAGTCTGG 12000

TCATCTACGG GGCCCTGGAG ACGGACAGCG GTCTGTACAC CCTGTCCGTG GTCGGCCTAA 12060

GCGACGAGGC GCGCCAAGTG GCGTCGGTGG TTCTGGTCGT GGAGCCCGCC CCTGTGCCGA 12120

CCCCGACCCC CGACGACTAC GACGAAGAAG ACGACGCGGG CGTGAGCGAA CGCACGCCGG 12180

TCAGCGTTCC CCCCCCAACC CCCCCCCGTC GTCCCCCCGT CGCCCCCCCG ACGCACCCTC 12240

GTGTTATCCC CGAGGTGTCC CACGTGCGCG GGGTAACGGT CCATATGGAG ACCCCGGAGG 12300

CCATTCTGTT TGCCCCCGGG GAGACGTTTG GGACGAACGT CTCCATCCAC GCCATTGCCC 12360

ACGACGACGG TCCGTACGCC ATGGACGTCG TCTGGATGCG GTTTGACGTG CCGTCCTCGT 12420

GCGCCGAGAT GCGGATCTAC GAAGCTTGTC TGTATCACCC GCAGCTTCCA GAGTGTCTAT 12480

CTCCGGCCGA CGCGCCGTGC GCCGTAAGTT CCTGGGCGTA CCGCCTGGCG GTCCGCAGCT 12540

ACGCCGGCTG TTCCAGGACT ACGCCCCCGC CGCGATGTTT TGCCGAGGCT CGCATGGAAC 12600

CGGTCCCGGG GTTGGCGTGG CTGGCCTCCA CCGTCAATCT GGAATTCCAG CACGCCTCCC 12660

CCCAGCACGC CGGCCTCTAC CTGTGCGTGG TGTACGTGGA CGATCATATC CACGCCTGGG 12720

GCCACATGAC CATCAGCACC GCGGCGCAGT ACCGGAACGC GGTGGTGGAA CAGCACCTCC 12780

CCCAGCGCCA GCCCGAGCCC GTCGAGCCCA CCCGCCCGCA CGTGAGAGCC CCCCCTCCCG 12840

CGCCCTCCGC GCGCGGCCCG CTGCGCCTCG GGGCGGTGCT GGGGGCGGCC CTGTTGCTGG 12900

CCGCCCTCGG GCTGTCCGCG TGGGGCGTGC ATGACCTGCT GGCGCAGGCG CTCCTGGCGG 12960

GCGGTTAAAA GCCGGGCCTC GGCGAGGGGC CCCACTTACA TTCGCGTGGC GGACAGCGAG 13020

CTGTACGCGG ACTGGAGTTC GGACAGCGAG GGGGAGCGCG ACGGGTCCCT GTGGCAGGAC 13080

CCTCCGGAGA GACCCGACTC TCCCTCCACA AATGGATCCG GCTTTGAGAT CTTATCACCA 13140

ACGGCTCCGT CTGTATACCC CCATAGCGAG GGGCGTAAAT CTCGCCGCCC GCTCACCACC 13200

TTTGGTTCGG GAAGCCCGGG CCGTCGTCAC TCCCAGGCCT CCTATTCGTC CGTCCTCTGG 13260

TAAGGCGTCT TCCGACGACG CGGACGTCGG CGATGAACTG ATTGCCATCG CGGACGCACG 13320

CGGGGACCCG CCAGAGACCC TGCCCCCCGG CGCGGGCGGC GCCGCGCCCG CGTGCCGCAG 13380

ACCACCTCGC GGCGGCTCCC CCGCGGCCTT TCCCGTGGCC CTCCACGCCG TGGACGCCCC 13440

CTCCCAATTC GTCACCTGGC TCGCCGTGCG CTGGCTGCGG GGGGCGGTGG GTCTCGGGGC 13500

CGTCCTGTGC GGGATTGCGT TTTACGTGAC GTCAATCGCC CGAGGCGCAT AAAGGTCCGG 13560

CGGCCAGCCC CGCCGCAGCT CATAAAAATC GTGAGTCACG GCAACCGCAC CTTCGCCTCC 13620

GGCCCTCCGC CAGCGCCCTT CCGCGTCCGC GATGACCTCC CGGCCCGCCG ACCAAGACTC 13680

GGTGCGTTCC AGCGCGTCGG TGCCGCTTTA CCCCGCGGCC TCGCCCGTCC CGGCAGAAGC 13740

CTACTACTCG GAAAGCGAAG ACGAGGCCGC CAACGACTTC CTCGTGCGCA TGGGCCGCCA 13800

GCAGTCGGTC CTAAGGCGCC GACGGCGGCG CACGCGGTGC GTCGGGCTGG TTATCGCCTG 13860

TCTCGTCGTG GCCCTCCTAT CTGGAGGGTT CGGGGCACTT TTGGTGTGGC TGCTCCGCTA 13920

AATGACGCCT CGATGTATGG CGCCTTCTTC GCCCCCACCC CTCGCCGCGA CCCACGTCCG 13980

TATGTTAATT GCAATAAAGT GGTTGATTGT CATTACGGTC TACTAGGTTG TCTTTTTTTT 14040

TTGGGGGGGG GGGGAAGGAA ATGCAGAAAA GGGTAAGAAA TTCTCGGAAT TTCACCCCCC 14100

GGGGGGGGCA AGTGCAGTAC CCCAGTTCCT CAGTGTTTGG GAAATCTATT GAACTCTCCC 14160

GGCTCCTCCG TGTTAGGGAA GTCTCTTGGG GAAATCTATT GACCTCTCGC CCCCCCCCCC 14220

AGGAGGGGGC AGTGCAGTAC CCCAGTTCCT CCGTGCTGGG GAAATCTCTC TGCCGGGTAC 14280

GGGCTCCAGA CGAAGGACCC ATACATTTCC CCATCCGCAC CCCACATCTG GCGTTCTAGA 14340

GTCACGACGC ATTTGCCCCC GTCCCCGCAG CAACACACAA AGCGATTTCA ATTTTCACGA 14400

TTTTATTATT AATTACACCA ACCACCCTGT CCCCGGGACG TGGTCAGGAC CGGGGGTCCG 14460

CACCCAAACG CACGAAACAA ATGCTGGCAG TGTGCCGAAT ATAACCCCGC GTAGGAACAC 14520

GTCGACGCGT GCGCCAAACA GCACCAGAAG GCGCATGCCA TCAGCAGGTC GTGCATATGG 14580

CGATGTGTTT GGACGCAGGG CGCAGCCGCG GCGATAAAAT TCATGGCGGC CGTCCGCCAG 14640

GGCCACAGCG GCGAGGACTC CCTGTTGGCC CGAAGCCATT GGGTATGAAC CAGCTGCGCC 14700

TCCTGTCCGA CCCTGGCTCC CGCCAGCGGG GGCGGTGGGT CGTGGGTGTT GAGAGCACAC 14760

AGGCGGGACA CCTCGATCAC CGTCCGAAAA AAGGCCCGGT GGTCCGCGGG CAGCATCTGC 14820

AGGTGCGCCA GGGCCTGGGC GTTGAGAGGG TACAACTCGG AGCCGGGGGA CTCCGGGGGC 14880

CGGTCCGCGC GGTGCCGCGA GTGGGCACGC TTTGGGGCCC GGGTGTCGGA CGCGGGCGCG 14940

TTACGGATCC CGACGCGGGG CAGAACGTAC GTGCGTTGGC GCGGCGATGA GGGGTCCGGG 15000

CTGCCGAGGG GGGCGTAGGG GACCGGGCTA GGCAAGCCCG CGGGTTGCGC GGGGTTCCCG 15060

TGGGGGTCTA GGCTCCCTGG GCACCCGTGG GGGTCGTGGG GGTCGCGGGT CCCTGGGTAT 15120

GCGCGGGACC CTGGGTTCTC TGGGAGATCG TGGAACTCGC GGTTCCCTGG GCTCTCGGGG 15180

AACCCGGGGC TCCCTGGGGA CACGTGGTGC CCTGGGAATT CTTGATGGTC GGACGGCTTC 15240

AGATGGCTTC GGGATCGAGA GGGCCGCACA GACTCGTAGT AGACCCGAAT CTCCACGTTT 15300

CCCCGCCGCC GGATCATGGT CGCCGCCCCG GTGCGGGGGC CCGTCGGTCG GAAGCGAGTG 15360

CCCTTCAAGC GTGTCCGCTC CTCTGGGCTG CATGCCGTCG GATGGGGTGC CTTTTAAGGA 15420

AAGGTCTCGG CTGCCCGCCC CAACCGGGGT TTGGGGGTGG GCCGGGGAAA CCCCGGATGC 15480

CATGGGGGGG GTCACACCCT AAGCGCCGGC GCGCTGGTTG GGTGGGGGTA GAGGGGAGTC 15540

CCCGGTCGAC GAGATCGTAT CAAGGGGCCA GCACGCGATC CTGCCGCTCG TTCGATCTAG 15600

CACACCCACG GGTCTGCTGT GTGGGATTTC GACTCGCGGG ATCCGATCGC ACGTCCGGAG 15660

GACACAGCAG CGGGAGCTCC GGGTCGGTCA CCGCAGTTCT GGCCGCCTCT CGGTCCTCCC 15720

GTTCCCTTTT ATGGATCTCC GCGCAGACAT CGCCATACGT CCGGTGTGTG CACCGCGAAG 15780

AATCCAAAAA CATGTCCGTC GTTTTCAGGG CCCAAGACAT GGTGTCCCGT CCACGAAGGC 15840

GGCGCCCGGC CTGCGAGAAA GCGCGGATGT TGGGATCGGG GCCCCCCCGT CCCGTCCCC 15900

(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 414 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

Met Ala Asp Ile Pro Pro Asp Pro Pro Ala Leu Asn Thr Thr Pro Ala
1 5 10 15

Asn His Ala Pro Pro Ser Pro Pro Pro Gly Ser Arg Lys Arg Arg Arg
20 25 30

Pro Val Leu Pro Ser Ser Ser Glu Ser Glu Gly Lys Pro Asp Thr Glu
35 40 45

Ser Glu Ser Ser Ser Thr Glu Ser Ser Glu Asp Glu Ala Gly Asp Leu
50 55 60

Arg Gly Gly Arg Arg Arg Ser Pro Arg Glu Leu Gly Gly Arg Tyr Phe
65 70 75 80

Leu Asp Leu Ser Ala Glu Ser Thr Thr Gly Thr Glu Ser Glu Gly Thr
85 90 95

Gly Pro Ser Asp Asp Asp Asp Asp Asp Ala Ser Asp Gly Trp Leu Val
100 105 110

Asp Thr Pro Pro Arg Lys Ser Lys Arg Pro Arg Ile Asn Leu Arg Leu
115 120 125

Thr Ser Ser Pro Asp Arg Arg Ala Gly Val Val Phe Pro Glu Val Trp
130 135 140

Arg Asn Asp Arg Pro Ile Arg Ala Ala Gln Pro Gln Ala Pro Ala Gln
145 150 155 160

Ser Ser Gly Asp Arg Ala Ala Ala Pro Arg Arg Ser Ala Arg Gln Ala
165 170 175




Gln Met Arg Ser Gly Ala Ala Trp Thr Leu Asp Leu His Tyr Ile Arg
180 185 190

Gln Cys Val Asn Gln Leu Phe Arg Ile Leu Arg Ala Ala Pro Asn Pro
195 200 205

Pro Gly Ser Ala Asn Arg Leu Arg His Leu Val Arg Asp Cys Tyr Leu
210 215 220

Met Gly Tyr Cys Arg Thr Arg Leu Gly Pro Arg Thr Trp Gly Arg Leu
225 230 235 240

Leu Gln Ile Ser Gly Gly Thr Trp Asp Val Arg Leu Arg Asn Ala Ile
245 250 255

Arg Glu Val Glu Ala Arg Phe Glu Pro Ala Ala Glu Pro Val Cys Glu
260 265 270

Leu Pro Cys Leu Asn Ala Arg Arg Tyr Gly Pro Glu Cys Asp Val Gly
275 280 285

Asn Leu Glu Thr Asn Gly Gly Ser Thr Ser Asp Asp Glu Ile Ser Asp
290 295 300

Ala Thr Asp Ser Asp Asp Thr Leu Ala Ser His Ser Asp Thr Glu Gly
305 310 315 320

Gly Pro Ser Pro Ala Gly Arg Glu Asn Pro Glu Ser Ala Ser Gly Gly
325 330 335

Ala Ile Ala Ala Arg Leu Glu Cys Glu Phe Gly Thr Phe Asp Trp Thr
340 345 350

Ser Glu Glu Gly Ser Gln Pro Trp Leu Ser Ala Val Val Ala Asp Thr
355 360 365

Ser Ser Ala Glu Arg Ser Gly Leu Pro Ala Pro Gly Ala Cys Arg Ala
370 375 380

Thr Glu Ala Pro Glu Arg Glu Asp Gly Cys Arg Lys Met Arg Phe Pro
385 390 395 400

Ala Ala Cys Pro Tyr Pro Cys Gly His Thr Phe Leu Arg Pro
405 410

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 287 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

Met Gly Val Val Val Val Ser Val Val Thr Leu Leu Asp Gln Arg Asn
1 5 10 15

Ala Leu Pro Arg Thr Ser Ala Asp Asp Ala Leu Trp Ser Phe Leu Leu
20 25 30

Arg Gln Cys Arg Ile Leu Ala Ser Glu Pro Leu Gly Thr Pro Val Val
35 40 45

Val Arg Pro Ala Asn Leu Arg Arg Leu Ala Glu Pro Leu Met Asp Leu
50 55 60

Pro Lys Phe Trp Ile Val Arg Thr Arg Ser Cys Arg Cys Pro Pro Asn
65 70 75 80

Thr Thr Thr Gly Leu Phe Ala Glu Asp Asp Pro Leu Glu Ser Ile Glu
85 90 95

Ile Leu Asp Ala Pro Ala Cys Phe Arg Leu Leu His Gln Glu Arg Pro
100 105 110

Gly Pro His Arg Leu Tyr His Leu Trp Val Val Gly Ala Ala Asp Leu
115 120 125

Cys Val Pro Phe Leu Glu Tyr Ala Gln Lys Thr Arg Leu Gly Phe Arg
130 135 140

Phe Ile Ala Met Lys Thr Asn Asp Ala Trp Val Gly Glu Pro Trp Pro
145 150 155 160

Leu Pro Asp Arg Phe Leu Pro Glu Arg Thr Val Ser Trp Thr Pro Phe
165 170 175

Pro Ala Ala Pro Asn His Pro Leu Glu Asn Leu Leu Ser Arg Tyr Glu
180 185 190

Tyr Gln Tyr Gly Val Val Val Pro Gly Asp Arg Glu Arg Ser Cys Leu
195 200 205

Arg Trp Leu Arg Ser Leu Val Ala Pro His Asn Lys Pro Arg Pro Ala
210 215 220

Ser Ser Arg Pro His Pro Ala Thr His Pro Thr Gln Arg Pro Cys Phe
225 230 235 240

Thr Cys Met Gly Arg Pro Glu Ile Pro Asp Glu Pro Ser Trp Gln Thr
245 250 255

Gly Asp Asp Asp Pro Gln Asn Pro Gly Pro Pro Leu Ala Val Gly Asp
260 265 270

Glu Trp Pro Pro Ser Ser His Val Cys Tyr Pro Ile Thr Asn Leu
275 280 285

(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 507 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single




(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

Val Gly Gly Cys Val Asp Lys Leu Pro Leu Leu Lys Thr Pro Gly Pro
1 5 10 15

Val Arg Ala Arg Trp Leu Ala Arg Ala Thr Arg Arg Met Ala Cys Arg
20 25 30

Lys Phe Cys Gly Val Tyr Arg Arg Pro Asp Lys Arg Gln Glu Ala Ser
35 40 45

Val Pro Pro Glu Thr Asn Thr Ala Pro Ala Phe Pro Ala Ser Thr Phe
50 55 60

Tyr Thr Pro Ala Glu Asp Ala Tyr Leu Ala Pro Gly Pro Pro Glu Thr
65 70 75 80

Ile His Pro Ser Arg Pro Pro Ser Pro Gly Glu Ala Ala Arg Leu Cys
85 90 95

Gln Leu Gln Glu Ile Leu Ala Gln Met His Ser Asp Glu Asp Tyr Pro
100 105 110

Ile Val Asp Ala Ala Gly Ala Glu Glu Glu Asp Glu Ala Asp Asp Asp
115 120 125

Ala Pro Asp Asp Val Ala Tyr Pro Glu Asp Tyr Ala Glu Gly Arg Phe
130 135 140

Leu Ser Met Val Ser Ala Ala Pro Leu Pro Gly Ala Ser Gly His Pro
145 150 155 160

Pro Val Pro Gly Arg Ala Ala Pro Pro Asp Val Arg Thr Cys Asp Ser
165 170 175

Gly Lys Val Gly Ala Thr Gly Phe Thr Pro Glu Glu Leu Asp Thr Met
180 185 190

Asp Arg Glu Ala Leu Arg Ala Ile Ser Arg Gly Cys Lys Pro Pro Ser
195 200 205

Thr Leu Ala Lys Leu Val Thr Gly Leu Gly Phe Ala Ile His Gly Ala
210 215 220

Leu Ile Pro Gly Ser Glu Gly Cys Val Phe Asp Ser Ser His Pro Asn
225 230 235 240

Tyr Pro His Arg Val Ile Val Lys Ala Gly Trp Tyr Ala Ser Thr Asn
245 250 255

His Glu Ala Arg Leu Leu Arg Arg Leu Asn His Pro Ala Ile Leu Pro
260 265 270

Leu Leu Asp Leu His Val Val Ser Gly Val Thr Cys Leu Val Leu Pro
275 280 285

Lys Tyr His Cys Asp Leu Tyr Thr Tyr Leu Ser Lys Arg Pro Ser Pro
290 295 300

Leu Gly His Leu Gln Ile Thr Ala Val Ser Arg Gln Leu Leu Ser Ala
305 310 315 320

Ile Asp Tyr Val His Cys Glu Gly Ile Ile His Arg Asp Ile Lys Thr
325 330 335

Glu Asn Ile Leu Ile Asn Thr Pro Glu Asn Ile Cys Leu Gly Asp Phe
340 345 350

Gly Ala Ala Cys Phe Val Arg Gly Cys Arg Ser Ser Pro Phe His Tyr
355 360 365

Gly Ile Ala Gly Thr Ile Asp Thr Asn Ala Pro Glu Val Leu Ala Gly
370 375 380

Asp Pro Tyr Thr Gln Val Ile Asp Ile Trp Ser Ala Gly Leu Val Ile
385 390 395 400

Phe Glu Thr Ala Val His Thr Ala Ser Leu Phe Ser Ala Pro Arg Asp
405 410 415

Pro Glu Arg Arg Pro Cys Asp Asn Gln Ile Ala Arg Ile Ile Arg Gln
420 425 430

Ala Gln Val His Val Asp Glu Phe Pro Thr His Ala Glu Ser Arg Leu
435 440 445

Thr Ala His Tyr Arg Ser Arg Ala Ala Gly Asn Asn Arg Pro Ala Trp
450 455 460

Trp Ala Trp Thr Arg Tyr Tyr Lys Ile His Thr Asp Val Glu Tyr Leu
465 470 475 480

Ile Cys Lys Ala Leu Thr Phe Asp Ala Ala Leu Arg Pro Ser Ala Ala
485 490 495

Glu Leu Leu Arg Leu Pro Leu Phe His Pro Lys
500 505






(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

Val Cys Ile Ala Tyr His Gly Met Gly Arg Leu Thr Ser Gly Val Gly
1 5 10 15

Thr Ala Ala Leu Leu Val Val Ala Val Gly Leu Arg Val Val Cys Ala
20 25 30

Lys Tyr Ala Asp Pro Ser Leu Lys Met Ala Asp Pro Asn Arg Phe Arg
35 40 45

Gly Lys Asn Leu Pro Val Leu Asp Gln Leu Thr Asp Pro Pro Gly Val
50 55 60

Lys Arg Val Tyr His Ile Gln Pro Ser Leu Glu Asp Pro Phe Gln Pro
65 70 75 80

Pro Ser Ile Pro Ile Thr Val Tyr Tyr Ala Val Leu Glu Arg Ala Cys
85 90 95

Arg Ser Val Leu Leu His Ala Pro Ser Glu Ala Pro Gln Ile Val Arg
100 105 110

Gly Ala Ser Asp Glu Ala Arg Lys His Thr Tyr Asn Leu Thr Ile Ala
115 120 125

Trp Tyr Arg Met Gly Asp Asn Cys Ala Ile Pro Ile Thr Val Met Glu
130 135 140

Tyr Thr Glu Cys Pro Tyr Asn Lys Ser Leu Gly Val Cys Pro Ile Arg
145 150 155 160

Thr Gln Pro Arg Trp Ser Tyr Tyr Asp Ser Phe Ser Ala Val Ser Glu
165 170 175

Asp Asn Leu Gly Phe Leu Met His Ala Pro Ala Phe Glu Thr Ala Gly
180 185 190

Thr Tyr Leu Arg Leu Val Lys Ile Asn Asp Trp Thr Glu Ile Thr Gln
195 200 205

Phe Ile His Arg Ala Arg Ala Ser Cys Lys Tyr Ala Leu Pro Leu Arg
210 215 220

Ile Pro Pro Ala Ala Cys Leu Thr Ser Lys Ala Tyr Gln Gln Gly Val
225 230 235 240

Thr Val Asp Ser Ile Gly Met Leu Pro Arg Phe Ile Pro Glu Asn Gln
245 250 255

Arg Thr Val Ala Lys Leu Lys Ile Ala Gly Trp His Gly Pro Lys Pro
260 265 270

Pro Tyr Thr Ser Thr Leu Leu Pro Pro Glu Leu Ser Asp Thr Thr Asn
275 280 285

Ala Thr Gln Pro Glu Leu Val Pro Glu Asp Pro Glu Asp Ser Ala Leu
290 295 300

Leu Glu Asp Pro Ala Gly Thr Val Ser Ser Gln Ile Pro Pro Asn Trp
305 310 315 320

His Ile Pro Ser Ile Gln Asp Val Ala Pro His His Ala Pro Ala Ala
325 330 335

Pro Ser Asn Pro Gly Leu Ile Ile Gly Ala Gly Ser Thr Leu Ala Val
340 345 350

Leu Val Ile Gly Gly Ile Ala Phe Trp Val Arg Arg Arg Ala Gln Met
355 360 365

Ala Pro Lys Arg Leu Arg Leu Pro His Ile Arg Asp Asp Asp Ala Pro
370 375 380

Pro Ser His Gln Pro Leu Phe Tyr
385 390

(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

Val Cys Ile Ala Tyr His Gly Met Gly Arg Leu Thr Ser Gly Val Gly
1 5 10 15

Thr Ala Ala Leu Leu Val Val Ala Val Gly Leu Arg Val Val Cys Ala
20 25 30

Lys Tyr Ala Asp Pro Ser Leu Lys Met Ala Asp Pro Asn Arg Phe Arg
35 40 45

Gly Lys Asn Leu Pro Val Leu Asp Gln Leu Thr Asp Pro Pro Gly Val
50 55 60

Lys Arg Val Tyr His Ile Gln Pro Ser Leu Glu Asp Pro Phe Gln Pro
65 70 75 80

Pro Ser Ile Pro Ile Thr Val Tyr Tyr Ala Val Leu Glu Arg Ala Cys
85 90 95

Arg Ser Val Leu Leu His Ala Pro Ser Glu Ala Pro Gln Ile Val Arg
100 105 110

Gly Ala Ser Asp Glu Ala Arg Lys His Thr Tyr Asn Leu Thr Ile Ala
115 120 125

Trp Tyr Arg Met Gly Asp Asn Cys Ala Ile Pro Ile Thr Val Met Glu
130 135 140

Tyr Thr Glu Cys Pro Tyr Asn Lys Ser Leu Gly Val Cys Pro Ile Arg
145 150 155 160

Thr Gln Pro Arg Trp Ser Tyr Tyr Asp Ser Phe Ser Ala Val Ser Glu
165 170 175

Asp Asn Leu Gly Phe Leu Met His Ala Pro Ala Phe Glu Thr Ala Gly
180 185 190

Thr Tyr Leu Arg Leu Val Lys Ile Asn Asp Trp Thr Glu Ile Thr Gln
195 200 205

Phe Ile His Arg Ala Arg Ala Ser Cys Lys Tyr Ala Leu Pro Leu Arg
210 215 220

Ile Pro Pro Ala Ala Cys Leu Thr Ser Lys Ala Tyr Gln Gln Gly Val
225 230 235 240

Thr Val Asp Ser Ile Gly Met Leu Pro Arg Phe Ile Pro Glu Asn Gln
245 250 255

Arg Thr Val Ala Lys Leu Lys Ile Ala Gly Trp His Gly Pro Lys Pro
260 265 270

Pro Tyr Thr Ser Thr Leu Leu Pro Pro Glu Leu Ser Asp Thr Thr Asn
275 280 285

Ala Thr Gln Pro Glu Leu Val Pro Glu Asp Pro Glu Asp Ser Ala Leu
290 295 300

Leu Glu Asp Pro Ala Gly Thr Val Ser Ser Gln Ile Pro Pro Asn Trp
305 310 315 320

His Ile Pro Ser Ile Gln Asp Val Ala Pro His His Ala Pro Ala Ala
325 330 335

Pro Ser Asn Pro Gly Leu Ile Ile Gly Ala Gly Ser Thr Leu Ala Val
340 345 350

Leu Val Ile Gly Gly Ile Ala Phe Trp Val Arg Arg Arg Ala Gln Met
355 360 365

Ala Pro Lys Arg Leu Arg Leu Pro His Ile Arg Asp Asp Asp Ala Pro
370 375 380

Pro Ser His Gln Pro Leu Phe Tyr
385 390






(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 429 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

Val Tyr Leu Trp Ala Arg Val Gly Gly Trp Leu Gly Tyr Leu Gly Gly
1 5 10 15

Thr Trp Thr Pro His Lys Gly Ser Leu Glu Gly Gly Lys Leu Gly Gln
20 25 30

Phe Ile Gly Arg Glu Arg Gly Ala Arg Thr Ala Val Pro Thr Ile Ser
35 40 45

His Arg Ala His Ser His Leu Asp Pro Ser Asp Pro Gly Met Pro Gly
50 55 60

Arg Ser Leu Gln Gly Leu Ala Ile Leu Gly Leu Trp Val Cys Ala Thr
65 70 75 80

Gly Leu Val Val Arg Gly Pro Thr Val Ser Leu Val Ser Asp Ser Leu
85 90 95

Val Asp Ala Gly Ala Val Gly Pro Gln Gly Phe Val Glu Glu Asp Leu
100 105 110

Arg Val Phe Gly Glu Leu His Phe Val Gly Ala Gln Val Pro His Thr
115 120 125

Asn Tyr Tyr Asp Gly Ile Ile Glu Leu Phe His Tyr Pro Leu Gly Asn
130 135 140

His Cys Pro Arg Val Val His Val Val Thr Leu Thr Ala Cys Pro Arg
145 150 155 160

Arg Pro Ala Val Ala Phe Thr Leu Cys Arg Ser Thr His His Ala His
165 170 175

Ser Pro Ala Tyr Pro Thr Leu Glu Leu Gly Leu Ala Arg Gln Pro Leu
180 185 190

Leu Arg Val Arg Thr Ala Thr Arg Asp Tyr Ala Gly Val Leu Arg Val
195 200 205

Trp Val Gly Ser Ala Thr Asn Ala Ser Leu Phe Val Leu Gly Val Ser
210 215 220

Ala Asn Gly Thr Phe Val Tyr Asn Gly Ser Asp Tyr Gly Ser Cys Asp
225 230 235 240

Pro Ala Gln Leu Pro Phe Ser Ala Pro Arg Leu Gly Pro Ser Ser Val
245 250 255

Tyr Thr Pro Gly Ala Ser Arg Pro Thr Pro Pro Arg Thr Thr Thr Ser
260 265 270

Pro Ser Ser Pro Arg Asp Pro Thr Pro Ala Pro Gly Asp Thr Gly Thr
275 280 285

Pro Ala Pro Ala Ser Gly Glu Arg Ala Pro Pro Asn Ser Thr Arg Ser
290 295 300

Ala Ser Glu Ser Arg His Arg Leu Thr Val Ala Gln Val Ile Gln Ile
305 310 315 320

Ala Ile Pro Ala Ser Ile Ile Ala Phe Val Phe Leu Gly Ser Cys Ile
325 330 335

Cys Phe Ile His Arg Cys Gln Arg Arg Tyr Arg Arg Pro Arg Gly Gln
340 345 350

Ile Tyr Asn Pro Gly Gly Val Ser Cys Ala Val Asn Glu Ala Ala Met
355 360 365

Ala Arg Leu Gly Ala Glu Leu Arg Ser His Pro Asn Thr Pro Pro Lys
370 375 380

Pro Arg Arg Arg Ser Ser Ser Ser Thr Thr Met Pro Ser Leu Thr Ser
385 390 395 400

Ile Ala Glu Glu Ser Glu Pro Gly Pro Val Val Leu Leu Ser Val Ser
405 410 415

Pro Arg Pro Arg Ser Gly Pro Thr Ala Pro Gln Glu Val
420 425

(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 430 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 

Met Arg Ala Gly Leu Val Phe Phe Val Gly Val Trp Val Val Ser Cys
1 5 10 15

Leu Ala Ala Ala Pro Arg Thr Ser Trp Lys Arg Val Thr Ser Gly Glu
20 25 30

Asp Val Val Leu Leu Pro Ala Pro Ala Gly Pro Glu Glu Arg Thr Arg
35 40 45

Ala His Lys Leu Leu Trp Ala Ala Glu Pro Leu Asp Ala Cys Gly Pro
50 55 60

Leu Arg Pro Ser Trp Val Trp Pro Pro Arg Arg Val Leu Glu Thr Val
65 70 75 80

Val Asp Ala Ala Cys Met Arg Ala Pro Glu Pro Leu Ala Ile Ala Tyr
85 90 95

Ser Pro Pro Phe Pro Ala Gly Asp Glu Gly Ser Glu Leu Ala Trp Arg
100 105 110

Asp Arg Val Ala Val Val Asn Glu Ser Leu Val Ile Tyr Gly Ala Leu
115 120 125

Glu Thr Asp Ser Gly Thr Leu Ser Val Val Gly Leu Ser Asp Glu Ala
130 135 140

Arg Gln Val Ala Ser Val Val Leu Val Val Glu Pro Ala Pro Val Pro
145 150 155 160

Thr Pro Thr Pro Asp Asp Tyr Asp Glu Glu Asp Asp Ala Gly Val Ser
165 170 175

Thr Pro Val Ser Val Pro Pro Pro Thr Pro Pro Arg Arg Pro Pro Val
180 185 190

Ala Pro Pro Thr His Pro Arg Val Ile Pro Glu Val Ser His Val Arg
195 200 205

Gly Val Thr Val His Met Pro Glu Ala Ile Leu Phe Ala Pro Gly Glu
210 215 220

Thr Phe Gly Thr Asn Val Ser Ile His Ala Ile Ala His Asp Asp Gly
225 230 235 240

Pro Tyr Ala Met Asp Val Val Trp Met Arg Phe Asp Val Pro Ser Ser
245 250 255

Cys Ala Glu Met Arg Ile Tyr Glu Ala Cys Leu Tyr His Pro Gln Leu
260 265 270

Pro Glu Cys Leu Ser Pro Ala Asp Ala Pro Cys Ala Val Ser Ser Trp
275 280 285

Ala Tyr Arg Leu Ala Val Arg Ser Tyr Ala Gly Cys Ser Arg Thr Thr
290 295 300

Pro Pro Pro Arg Cys Phe Ala Glu Ala Arg Met Glu Pro Val Pro Gly
305 310 315 320

Leu Ala Trp Leu Ala Ser Thr Val Asn Leu Glu Phe Gln His Asp Gln
325 330 335

His Ala Gly Leu Cys Val Val Tyr Val Asp Asp His Ile His Ala Trp
340 345 350

Gly His Met Thr Ile Ser Thr Ala Ala Gln Tyr Arg Asn Ala Val Val
355 360 365

Glu Gln His Leu Pro Gln Arg Gln Pro Glu Pro Val Glu Pro Trp His
370 375 380

Val Arg Ala Pro Pro Pro Ala Pro Ser Arg Pro Leu Arg Leu Gly Ala
385 390 395 400

Val Leu Gly Ala Ala Leu Leu Leu Ala Ala Leu Gly Leu Ser Ala Trp
405 410 415

Gly Val His Asp Leu Leu Ala Gln Ala Leu Leu Ala Gly Gly
420 425 430






(2) INFORMATION FOR SEQ ID NO: 181: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:



Val His Ala Val Asp Ala Pro Ser Gln Phe Val Thr Trp Leu Ala Val
1 5 10 15

Arg Trp Leu Arg Gly Ala Val Gly Leu Gly Ala Val Leu Cys Gly Ile
20 25 30

Ala Phe Tyr Val Thr Ser Ile Arg Ala
35 40

(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Met Thr Ser Arg Pro Ala Asp Gln Asp Ser Val Arg Ser Ser Ala
1 5 10 15

Val Pro Leu Tyr Pro Ala Asp Val Pro Ala Glu Ala Tyr Tyr Ser
20 25 30

Ser Glu Asp Glu Ala Ala Asn Asp Phe Leu Val Arg Met Gly Arg
35 40 45

Gln Ser Val Leu Arg Arg Arg Arg Arg Arg Thr Arg Cys Val Gly
50 55 60

Val Ile Ala Cys Leu Val Val Leu Ser Gly Gly Phe Gly Ala Leu
65 70 75

Val Trp Leu Leu Arg
85

(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 296 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

Met Ile Arg Arg Arg Gly Asn Val Glu Ile Arg Val Tyr Tyr Glu Ser
1 5 10 15

Val Arg Pro Ser Arg Ser Arg Ser His Leu Lys Pro Ser Asp His Gln
20 25 30

Glu Phe Pro Gly His His Val Ser Pro Gly Ser Pro Gly Phe Pro Glu
35 40 45

Ser Pro Gly Asn Arg Glu Phe His Asp Leu Pro Glu Asn Pro Gly Ser
50 55 60

Arg Ala Tyr Pro Gly Thr Arg Asp Pro His Asp Pro His Gly Cys Pro
65 70 75 80

Gly Ser Leu Asp Pro His Gly Asn Pro Ala Gln Pro Ala Gly Leu Pro
85 90 95

Ser Pro Val Pro Tyr Ala Pro Leu Gly Ser Pro Asp Pro Ser Ser Pro
100 105 110

Arg Gln Arg Thr Tyr Val Leu Pro Arg Val Gly Ile Arg Asn Ala Pro
115 120 125

Ala Ser Asp Thr Arg Ala Pro Lys Arg Ala His Ser Arg His Arg Ala
130 135 140

Asp Arg Pro Pro Glu Ser Pro Gly Ser Glu Leu Tyr Pro Leu Asn Ala
145 150 155 160

Gln Ala His Leu Gln Met Leu Pro Ala Asp His Arg Ala Phe Phe Arg
165 170 175

Thr Val Ile Glu Val Ser Arg Leu Cys Ala Leu Asn Thr His Asp Pro
180 185 190

Pro Pro Pro Leu Ala Gly Ala Arg Val Gly Gln Glu Ala Gln Leu Val
195 200 205

His Thr Gln Trp Leu Arg Ala Asn Arg Glu Ser Ser Pro Leu Trp Pro
210 215 220

Trp Arg Thr Ala Ala Met Asn Phe Ile Ala Ala Ala Ala Pro Cys Val
225 230 235 240

Gln Thr His Met His Asp Leu Leu Met Ala Cys Ala Phe Trp Cys Cys
245 250 255

Leu Ala His Ala Ser Thr Cys Ser Tyr Ala Gly Ser Ala His Cys Gln
260 265 270

His Leu Phe Arg Ala Phe Gly Cys Gly Pro Pro Val Leu Thr Thr Ser
275 280 285

Arg Gly Gln Gly Gly Trp Cys Asn
290 295

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 178 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

ATGCACGTGT AACCGCCAGT CCGTGCTTGC CTAGCGAACT CACCCGTCCC GGCTGGCGTG 60
CGCAGCCCGG GCCGTGTTGC GGGCCCTCTT AAGGGGCGGC GGCAGGACGG GGACTCCGCC 120
CCGCCTCCTT TCCCCCGGGG AGTCAACCCC CGGGGGGGTG TATTCTGGGG GGGGGGT 178

(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 



GACCCAGATC CCACCCCCGC CCGCAACGGG GCGCCGCCGC TGCTGCTGCT CCGCGGGGCG 60 

CCAGGGGGCG CCGGTCGGGT CGCGGCGGGC TGGGAGGTTC CGCGGGTCGC CCCCGCACCG 120 

CCGCCCCCGC GCCGGGGCGC TCTTCGGGGG GCGGGCGGGA CGTAGTCCGC TGCAGAGGGA 180

GACAGAGACG GGAACCCCCG GTTAGTGCCN GACCCCCGCC CGACCCCCGC CCAGTGCCCG 240 

ACCCCCGCCC GACCCCCGCC CGACCCCCGC CCAGTGCCCG ACCCCCGCCC AGTGCCCGAC 300

CCCCGCCCAG TGCCCGACCC CCGCCCAGTG CCCGACCCCC GCCCGCCCTC ACCGTCGGCC 360 

AGGTCATCGT CCTCGTCGTC CGTGCCGGGC CACGGGGGGG TGGGCGACAG GGCGCGGACC 420 

GTGTGTCCCC CCAGCGACAG GGAGCGCGGG GCCGTCCGCG GGTTGCCCGT CCAGATAAAG 480

TCCACGGCCG TGCCGGACCG CACGGCCGCC TCGGCCTCCA CGCGGGTCCG GGGGTCGTTC 540 

ACTATCGGGA TGGTGCTGAA CGACCCGCTG GCGGTCACGC CCACTATCAG GTACGCCACC 600 

GGGGTGTTGC ACAGGGGACA CGTGTTGCGC AACGGAATCC AGGTCTTCAT GCACGGGATG 660 

CAGAAGGGGT GCAGGCAGGG AAAACTCTGG CAGCGCAGGG GCGGGGCGAT CTCGTCCGTG 720 

CACACGGCAC ACACGTCGCC CCCCCCTCCC GCTTCCGCTT CCTCCTCACC CACGGGCCCA 780

CCCCCGCAGG ATCCCTGCGC GTCGGCGGGC GTGGGGCTGC CCTGGCGCTC GGCCGGGGGC 840 

CGGGCCGGGG GCGTGGCCGC GTCCATCAGG CCCGCCTCGA ACATCTCCGT GTCCGTGCTG 900 

CCCGCCTCGG AGGTGGAGTC GCGGTGAAGG TCGTCGTCAG AGATTCCCAC CTCGGTCTCC 960 

TCCTCCGAGT CGCTGCTGGC GAGCCACTGC ATGTCGTTGA GCATCCCCCA GGCGTGCGGG 1020 

GCGGCGGGCT GCTTGACAAA GAAACGGGGG GGGATTTAGA GGGCGCGGGG CGTGAGGCGG 1080

GACCCCCGTG CCGTGTCCCC CGTGTCCCTC CCTCACCCCG GCCCCCCGCC CGCTGCTTTT 1140 

TGTTCGGAAG GGGGGGAGAA AGGGGTCCGT AACCAAAGGT GGTCTGCGTC CTTTGGATTC 1200 

CGACCCCTCG TCTCCCCCCC CCTGTCCCCC GCTCTCGGGC TCAGGGCTCC CTGCCTCCCT 1260

CGGCCCCCCA AAGGGTCGGG GGGCGGCGCA CGGCCCACGG GGGTCCCCCG ACCGCTTAAG 1320

CGGGCCGGGG GTCGGCCCCG TCAAGCGTCC CCGCCCCCGA GCCCACCGCC CGCGACCACC 1380 

CCCAACCCGC AGCCGGGTGG TCCGGGGAAA AGGGGGGGCC TGAGACCCGG GGGTCGCCCT 1440 

CTCACCGTGC CGGGGGTCTG CCGCGGCGGC CGCTCGGGGC CGGGGTCCGC CCGGGAGCTC 1500 

GTGCCGGGCC GGGGTTCCAT GAGCCGGGGT AGGGTAGACT CGAGACGGCG GCCCGCGGTC 1560

TCTCTCTTGC CGGGTGTTAG TCTCTGTCTC TCCGGGTCTC CTCCTCCCGC CGGGCCGCCG 1620 

CTCCGTCGCT CGCAGTGCCG GGGTGCGAAT GCGGCCCGAC CGTCACACGG GGCTGCCTTA 1680 

TACCCGGCGC CTATCCACTC CCCCAAAGGG GCGGCATTTA CGATTCCCCC AATAGCCGCG 1740 

CGCCCCGGCG GGGGCGGAGG GAGGGAATCC CCCCCTCTCG GGGCGGCCCC GTCCCCGGGG 1800 

ACCAACCGGG TGTACTCCAA GAACCCCATT AGCATGCGCC GCCCCCCGCC GACGCAGATG 1860

GGAGTCCCCC CGGCGCCCCG CCGGCGCGGC CCTGAGTGGT GCCCGCCCCC GGGGAGAAAT 1920 

TCATTAGCAT ACTAGGAAGC CCAGGGGACC AATAGGGGCC GATCAGCCCA CCCACCCGGC 1980 

GGCGCGCGAG GCTCTGCGTG TTCTGCCAAG AAAGTAATCA GCATAACCCG GAACCCCGAG 2040 

GGAGTAATTA CGCGGGGAGC GAGGGGCCGT CCGAACGTTT TTAATTACCA TAAGCGGGAA 2100 

TGGCGGCCCG TTAAA 2116

(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 338 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Met Leu Asn Asp Met Gln Trp Leu Ala Ser Ser Asp Ser Glu Glu Glu
1 5 10 15

Thr Glu Val Gly Ile Ser Asp Asp Asp Leu His Arg Asp Ser Thr Ser
20 25 30

Glu Ala Gly Ser Thr Asp Thr Glu Met Phe Glu Ala Gly Leu Met Asp
35 40 45

Ala Ala Thr Pro Pro Ala Arg Pro Pro Ala Glu Arg Gln Gly Ser Pro
50 55 60

Thr Pro Ala Asp Ala Gln Gly Ser Cys Gly Gly Gly Pro Val Gly Glu
65 70 75 80

Glu Glu Ala Glu Ala Gly Gly Gly Gly Asp Val Cys Ala Val Cys Thr
85 90 95

Asp Glu Ile Ala Pro Pro Leu Arg Cys Gln Ser Phe Pro Cys Leu His
100 105 110

Pro Phe Cys Ile Pro Cys Met Lys Thr Trp Ile Pro Leu Arg Asn Thr
115 120 125

Cys Pro Leu Cys Asn Thr Pro Val Ala Tyr Leu Ile Val Gly Val Thr
130 135 140

Ala Ser Gly Ser Phe Ser Thr Ile Pro Ile Val Asn Asp Pro Arg Thr
145 150 155 160

Arg Val Glu Ala Glu Ala Ala Val Arg Ser Gly Thr Ala Val Asp Phe
165 170 175

Ile Trp Thr Gly Asn Pro Arg Thr Ala Pro Arg Ser Leu Ser Leu Gly
180 185 190

Gly His Thr Val Arg Ala Leu Ser Pro Thr Pro Pro Trp Pro Gly Thr
195 200 205

Asp Asp Glu Asp Asp Asp Leu Ala Asp Gly Glu Gly Gly Arg Gly Ser
210 215 220

Gly Thr Gly Arg Gly Ser Gly Thr Gly Arg Gly Ser Gly Thr Gly Arg
225 230 235 240

Gly Ser Gly Thr Gly Arg Gly Ser Gly Gly Gly Arg Ala Gly Val Gly
245 250 255

His Trp Ala Gly Val Gly Arg Gly Xaa Gly Thr Asn Arg Gly Phe Pro
260 265 270

Ser Leu Ser Pro Ser Ala Ala Asp Tyr Val Pro Pro Ala Pro Arg Arg
275 280 285

Ala Pro Arg Arg Gly Gly Gly Gly Ala Gly Ala Thr Arg Gly Thr Ser
290 295 300

Gln Pro Ala Ala Trp Ala Pro Pro Gly Ala Pro Arg Ser Ser Ser Ser
305 310 315 320

Gly Gly Ala Pro Leu Arg Ala Gly Val Gly Ser Gly Ser Xaa Xaa Xaa
325 330 335

Xaa Xaa






(2) INFORMATION FOR SEQ ID NO: 187: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 642 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

CGGCGGCGTT TCCGCGTTCC GTTTCTTCTC CCTCCCGGCC GCCCCGCTCC CGGGCCCGAC 60

CCTCGCCCCT TCCCTTCTCC TCGTCTTCCC CCGTCCCGCC GCGCCCCTTC CCTCTTCCTT 120

CTCTCTCTCT GTCTCGCTGT CTCGCTCTCC TCACATTTCC CCCCCCCCCC CCCGCCGCCG 180

CCGCCGCCCT CTGCCCGCGT CCCACCGAGA CGCCGCGCCG CGTGAGCCGT CCGCCGGGGG 240

ACCCAGGCTC CGGGGGGGGG GCGCGCCTGC GTGTGTCTCG TGTGAGAGAG CGCGCCCCTC 300

GAACGCCGCG CGTTCTCGCA GGTAGGTTTA GGGTCGTACA GGTGAGCTTC TGCTGAGGCG 360

GCGGGAGAGG GGGGGGCGGG CGGAAGAGAG AAGAGAGCAG GGGTTGGGGG AAAACTGTTC 420

TTCCTCCCCC TTTCAAGAAA CACGAGGCGG GGGTCCCAGA AAGGGCAGGC AGGTCAGCCG 480

CACCGCCCGC GAGCCAACCC GTATCCTTTT TTTCTAGGTG TTTTTGTTTT TGTTTCTGTT 540

TTTGTTTGTT TTGTTATTAT TTTCGCGGAT CCGGCGTGTT CGGATCCACC CCCCCCTTTC 600

TCCTTCCTCT TCCCTTCCAC CCACCCCCGT TTCCCCCCCC C 642

(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 353 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

CGCGCCCCCG CCCGGCCGCC GCGCGCCCCC GCCCGACCGC CGCGCGCCCC CGCCCGGCCG 60

CCGCGCGCCC CCGCCCGGCC GCCCGCGTCG CGCCGGCGCC CCCTCCCGGC GCTTCCGGGG 120

CCTTTCCTTC CTTCCCCGCC GCGACCCCGG CCCCGCCCCA CCGCCCCGCC CGGCAGGGGG 180

GCCCCGGCGC CGCGCAGAAC ACACAGACGA ACACACGGTG GCGATCTTTT CTTTACTTCG 240

GCAGACCAGC GAGCCCCGGC CCCGGCCCGC GCCCCGCCGC CACACCCACG GCACCCCCCC 300

CGCCGCCCAC CCCGGGGTCC ACACAGGAGC GCGCGGGCGG CAGAAACGCG GG 353



(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6386 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

CACGCGCCGC ATTTGCGCCC GGGGCCCCGC GCTCCCCCCG GGCGGCCTGG CCGTCGGGGG 60

CCAGATGTAC GTGAACCGCA ACGAGATCTT CAACGCCGCG CTGGCCGTTA CGAACATCAT 120 

CCTGGATCTG GACATCGCCC TGAAGGAGCC CGTCCCCTTT CCCCGGCTCC ACGAGGCCCT 180 

GGGTCACTTT AGGCGCGGGG CGCTGGCGGC GGTTCAGCTG TTGTTTCCCG CGGCCCGCGT 240 
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AGACCCCGAC GCCTATCCCT GTTATTTT.TT CAAAAGCGCC TGTCGGCCCC GCGCGCCGCC 300 

CGTCTGTGCG GGCGACGGGC CCTCGGCCGG TGGCGACGAC GGCGACGGGG ACTGGTTCCC 360 

CGACGCCGGT GGCGACGACG GCGACGAGGA GTGGGAGGAG GACACGGACC CCATGGACAC 420 

GACCCACGGC CCCCTCCCGG ACGACGAGGC CGCGTACCTC GACCTGCTAC ACGAACAGAT 480 

ACCAGCGGCG ACGCCCAGCG AACCGGACTC CGTCGTGTGT TCCTGCGCCG ACAAGATCGG 540

GCTGCGCGTG TGCCTACCGG TCCCCGCCCC GTACGTTGTG CACGGCTCCC TGACGATGCG 600 

TGGGGTGGCG AGGGTGATCC AGCAGGCGGT GCTGTTGGAC CGCGACTTCG TGGAGGCCGT 660 

AGGGAGCCAC GTAAAGAACT TTTTGCTGAT CGATACGGGC GTGTACGCCC ACGGCCACAG 720 

CCTGCGCTTG CCGTATTTCG CCAAGATCGG CCCCGACGGC TCCGCGTGCG GCCGGTTATT 780 

GCCCGTCTTC GTGATCCCCC CCGCGTGCGA GGACGTTCCG GCGTTCGTCG CCGCGCACGC 840

CGACCCGCGG CGCTTCCACT TTCACGCCCC GCCCATGTTT TCCGCGGCCC CGCGGGAGAT 900 

CCGCGTCCTC CACAGCCTGG GCGGGGACTA TGTCAGCTTT TTCGAGAAGA AGGCGTCGCG 960 

CAACGCCCTG GAGCACTTTG GGCGACGCGA GACCCTGACG GAGGTTCTGG GCCGCTACGA 1020 

TGTGCGGCCC GACGCCGGGG AGACCGTGGA GGGGTTCGCG TCAGAACTGC TGGGGCGAAT 1080 

AGTCGCGTGC ATCGAGGCCC ACTTTCCCGA GCACGCGCGG GAATATCAGG CCGTGTCCGT 1140

TCGCCGGGCC GTCATTAAGG ACGACTGGGT CCTGCTGCAG CTGATCCCCG GCCGCGGCGC 1200 

CCTGAACCAA AGCCTCTCGT GTCTGCGCTT CAAGCACGGC AGGGCAAGTC GCGCGACGGC 1260 

CCGGACCTTT CTCGCGCTGA GCGTCGGGAC CAACAACCGC CTATGCGCGT CCCTGTGTCA 1320 

GCAGTGCTTT GCCACTAAAT GCGATAACAA CCGCCTGCAC ACGCTGTTTA CCGTCGATGC 1380 

GGGCACGCCA TGCTCGCGGT CCGCTCCCTC CAGCACCTCA CGACCGTCAT CTTCATAACG 1440

GCCTACGGCC TCGTGCTCGC GTGGTACATC GTCTTTGGTG CCAGTCCGCT CCACCGATGT 1500 

ATTTACGCGG TGCGCCCCGC CGGGGCACAC AACGATACCG CCCTCGTGTG GATGAAGATA 1560 

AACCAGACGC TGTTGTTTCT GGGCCCGCCG ACCGCCCCCC CCGGCGGGGC ATGGACCCCC 1620 

CACGCCCACG TCTGCTACGC CAATATCATC GAAGGTCGGG CCGTGTCCCT CCCGGCCATC 1680 

CCCGGCGCCA TGAGCCGCCG GGTCATGAAC GTGCACGAGG CCGTAAACTG CTTGGAGGCC 1740

CTCTGGGACA CCCAGATGCG CCTGGTGGTC GTCGGTTGGT TTCTGTATCT AGCGTTCGTC 1800 

GCCCTTCACC AACGACGATG CATGTTCGGC GTCGTGAGTC CCGCGCACAG CATGGTGGCC 1860 

CCGGCGACCT ATCTTTTGAA CTACGCCGGC CGCATAGTGT CGAGCGTGTT CTTGCAATAC 1920 

CCCTACACGA AAATCACCCG CCTCCTCTGC GAGCTATCCG TTCAACGCCA GACCCTGGTG 1980 

CAGCTGTTCG AGGCGGATCC GGTCACCTTC TTGTACCACC GCCCGGCCGT TGGCGTCATC 2040

GTGGGCTGCG AGCTGCTGCT CCGCTTCGTG GCCCTCGGTC TCATCGTCGG CACCGCTCTC 2100 

ATCTCCCGGG GCGCCTGCGC GATCACATAC CCCCTGTTTC TAACAATCAC CACCTGGTGT 2160 

TTCGTGTCCA TCATCGCCCT GACGGAGCTG TATTTCATCC TGCGGCGGGA CTCGGCCCCC 2220 

AAAAACGCGG AACCAGCGGC CCCCAGGGGG CGCTCCAAAG GGTGGTCGGG CGTCTGCGGG 2280 

CGCTGCTGTT CCATCATCCT CTCCGGTATC GCCGTGCGCC TGTGCTATAT CGCCGTCGTG 2340

GCCGGGGTGG TGCTTATGGC GCTTCGCTAC GAACAGGAGA TTCAGCGGCG CCTGTTTGAT 2400 

CTGTGACGTA ACGCCTCTTC CGTTGGAAGA GGCGGACCCA GTCGCCCATG CAAATTAAAT 2460 

ACACGACCCG CCTCGGGCCT ACGCACCCTC GCACGTCGCA TGCAAATTAA AATCGTGCAC 2520 

AGAGCCGATC CGGCCTCGGG TCTGCTTGCC CCTCCCCCGG TCCAGCACAG GCAGGCTCGT 2580 

CCGACTTCCG CATACACCCC ACCCTACCGC GTGCTTCCGC ACCCCCGCCT ACGCGTGTAC 2640

GCGAAGGCGG ACCCAGACCT GCCGTATGCT AATTAAATAC ATAAAACCCA CCCTCGGCGT 2700 

CCGATTGGTT TCTGGGGACG GCGGGGGCGG GGGCGGTGAC GCCCGACGGG GAGGGACAAG 2760 

GAGGAGTTTC GGAAAGCCGG CCCCGGTCGT GCGGGTATAA GGGCAGCCAC CGGCCCACTG 2820

GGCGCTGTGT GCTGCCGTGT GCCGACCCCG GTTGCGCGTC GGTGCCGCTC CTCGATTCGG 2880

ACCCGGCCAC TCTCTTCCGA CACGCGCCCC CTCGGAGGAC ACCCGCCATC CCAGCCCCGG 2940 

CGACCTACAA CATGGCTACC GACATTGATA TGCTAATCGA CCTAGGATTG GACCTGTCCG 3000 

ACAGCGAGCT CGAGGAGGAC GCTCTGGAGC GGGACGAGGA GGGCCGCCGC GACGACCCCG 3060 

AGTCCGACAG CAGCGGGGAG TGTTCCTCGT CGGACGAGGA CATGGAAGAC CCCTGCGGAG 3120

ACGGAGGGGC GGAGGCCATC GACGCGGCGA TTCCCAAAGG TCCCCCGGCC CGCCCCGAGG 3180 

ACGCCGGCAC CCCCGAAGCC TCGACGCCTC GCCCGGCAGC GCGGCGGGGA GCCGACGATC 3240 

CGCCACCCGC GACCACCGGC GTGTGGTCGC GCCTCGGGAC CAGGCGGTCG GCTTCCCCCC 3300 

GGGAACCGCA CGGGGGGAAG GTGGCCCGCA TCCAACCCCC GTCGACCAAG GCACCGCATC 3360 

CCCGAGGCGG GCGGCGAGGT CGCCGCCGGG GCCGGGGTCG ATACGGCCCC GGCGGCGCCG 3420

ACTCCACACC AAACCCCCGC CGGCGCGTCT CCAGAAACGC CCACAACCAA GGGGGTCGCC 3480 

ACCCCGCGTC GGCGCGGACG GACGGCCCCG GCGCCACCCA CGGCGAGGCG CGGCGCGGAG 3540 

GGGAGCAGCT CGACGTCTCC GGGGGCCCGC GGCCACGAGG CACGCGCCAG GCCCCCCCTC 3600 

CGCTGATGGC GCTGTCCCTG ACCCCCCCGC ACGCGGACGG CCGCGCCCCG GTCCCGGAGC 3660 

GAAAGGCGCC CTCTGCCGAC ACCATCGACC CCGCCGTTCG GGCGGTTCTG CGATCCATAT 3720

CCGAGCGCGC GGCGGTCGAG CGCATCAGCG AAAGCTTTGG ACGCAGTGCC CTGGTCATGC 3780 

AAGACCCCTT TGGCGGGATG CCGTTTCCCG CCGCGAACAG CCCCTGGGCT CCCGTGCTGG 3840 

CCACCCAAGC GGGGGGGTTT GACGCCGAGA CCCGTCGGGT TTCCTGGGAA ACCCTGGTCG 3900 

CTCACGGCCC GAGCCTCTAC CGCACATTCG CAGCCAACCC GCGGGCCGCG TCGACAGCCA 3960 

AGGCCATGCG CGACTGCGTG CTGCGCCAGG AAAATCTCAT CGAGGCCCTG GCGTCCGCGG 4020

ATGAGACGCT GGCGTGGTGC AAGATGTGCA TTCACCACAA TCTGCCGCTC CGCCCCCAGG 4080 

ACCCTATCAT CGGAACGGCG GCCGCCGTGC TGGAAAACCT CGCCACGCGC CTGCGCCCCT 4140 

TTCTGCAGTG CTACCTGAAG GCCCGAGGCC TGTGCGGGCT GGACGACCTG TGCTCGCGGC 4200 

GACGCCTGTC GGACATTAAG GATATTGCCT CCTTTGTGTT GGTCATCCTG GCCCGCCTCG 4260 

CCAACCGCGT CGAGCGCGGC GTGTCGGAGA TCGACTACAC GACCGTGGGG GTTGGGGCCG 4320

GCGAGACGAT GCACTTTTAC ATCCCGGGGG CCTGCATGGC GGGTCTCATT GAAATACTGG 4380 

ACACGCACCG CCAGGAGTGT TCCAGTCGCG TGTGCGAGCT GACGGCCAGT CACACTATCG 4440 

CCCCCTTATA TGTGCACGGC AAATACTTCT ACTGCAACTC CCTATTTTAG GCAAGAATAA 4500 

ACATATTGAC GTCAACCCAA GTGGTTCCGT GTGATGTTCT TGGCGCGCGC GGCGGGTGGG 4560 

GCGGAGACTC CGGGGCGATG CCGGCGTGCG CGTGGGAGGA GGGCGATGAC CCACCGGATA 4620

AATGTGGGGC CCCGGCCCGG CCCGCTTCAT AGCGCGTCCA GGAACTCACG GCAGACGCGT 4680 

ATTCACCGAC CCCCCCCCTC GCAACATGAC AACGACGCCC CTCTCGAACC TGTTTTTACG 4740 

GGCCCCGGAC ATCACCCACG TCGCCCCCCC GTACTGTCTG AATGCCACGT GGCAGGCCGA 4800 

AAACGCCCTG CACACGACCA AAACGGACCC CGCGTGCCTG GCCGCGCGGA GTTATTTAGT 4860 

CCGCGCCTCC TGCTCGACCA GCGGCCCCAT CCACTGTTTT TTCTTTGCGG TGTACAAGGA 4920

CTCGCAGCAC TCCCTTCCGC TGGTTACCGA GCTCCGCAAC TTCGCGGACC TGGTCAACCA 4980 

CCCGCCCGTC TTGCGCGAAC TAGAGGATAA GCGTGGGGGG CGGCTGCGGT GCACGGGCCC 5040 

ATTCAGCTGC GGAACCATCA AGGACGTCTC CGGTGCATCC CCCGCGGGGG AATACACGAT 5100 

AAACGGTATC GTGTACCACT GTCACTGTCG GTATCCGTTC TCCAAAACCT GCTGGCTCGG 5160 

GGCATCCGCG GCCCTACAAC ACCTTCGCTC TATAAGCTCA AGCGGCACGG CCGCTCGCGC 5220

GGCAGAACAG CGACGCCACA AAATCAAAAT CAAAATCAAG GTATAACCCA CCCCCTTCCC 5280 

TCCGAGTCCG TATGCAACCT CATTAATAAA GAGTGAGAAC CAACCAAAAC AGACGCGGTG 5340 

TGAGTTTGTG GGTTATAGGA ACCCGGTAAA TACCACGCGA CGAACCAGCG TGTGTGTTAA 5400

CGCGACTTTT ATTCGTTGTA TCGCGGGAGG GGGGAAGCTT ACCGCCAAAG GAAGGCCAAG 5460

ATGATAACGA CGACCACCGC GACCACCCCA AAAACCGCAT GACGACACGT CCCGCCACAC 5520 

CACCCTGGGG CTTGGGGCGT GTCGGAGCTC GACGCACAGC GGGCCGCGCG TTGGGCCCGG 5580 

TACAGCTCTC GCGAATTGAC GAGCGGGGGT CGCCACGTGC GCGAGCTTTG CACGCGGGGT 5640 

TGGTCGGCCG GCCCCACGGA CCCGCCCGGT GGCTCGGTCG GACATGCGGC CATGACCATG 5700

GCGTAGGTGG GGGGGCGATC CGAGGTCGCC TCTGCGTAAG TAGGGAGGCC CGACGGGAGG 5760 

TCGCCTCCCA CGCCAGGGTG GGCCCCAATC ATAGTTTCCG GTAGAAACAG GGGGGTCTCC 5820 

ACAAACAACC CCCCTGGGCC AAAGCTCCGG CGCCGCGCCC GTCGTTCGGC GCGGCGCCTG 5880 

GCGCGCCGAG CGGCCCGCCA GGCGGCGCGG CGCGAGCGGC CACGCTCACA CACCTCGCCG 5940 

TCACCGGAAG AAGCCGGTGA AACAAGCCCA ACCGGCGACG TCCCTGCAGA GTACGGTGGA 6000

GGCGAGTCCG TGGGGGTGTC GATATCAATA ACGACAAACT GGCCCGCGCT CGCGCCGGCC 6060 

ACACTCTCGT ATGGGGGCGG GGCGTCAATC ACGCTATCAT CTCCGTCATC CCTGCATGCG 6120 

TGGGCATGCC CAGCCCCCAA CGCCATGGTG GGGATTCGCG GCTCAGAAGC CTGCATGTCG 6180 

TGTGGTCGGT CGTAGTCCAA CGTGCCTCCC CCACCCACCA CACAGCCGGT CCCCACGCCG 6240 

ACCACTAGAC CGCAGACGTC GCCCAACCGA GGTCCCCGTG CACAGACCGC GCCTTTTATA 6300

GCCCCAGGGG TTGCTAATTA ACGCACGCAT GCAGACGCAA TTTATTTTGC TCCCCCGCGT 6360 

CCTCCCCTCC CCCGCGTCCT CCCNT 6386 






(2) INFORMATION FOR SEQ ID NO: 190: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

Xaa Xaa Xaa Xaa Xaa Thr Arg Arg Ile Cys Arg Pro Ala Leu Pro Pro
1 5 10 15

Gly Gly Leu Ala Val Gly Gly Gln Met Tyr Val Asn Arg Asn Glu Ile
20 25 30

Phe Asn Ala Ala Val Thr Asn Ile Ile Leu Asp Leu Asp Ile Ala Leu
35 40 45

Lys Glu Pro Val Pro Phe Pro Arg Leu His Glu Ala Leu Gly His Phe
50 55 60

Arg Arg Gly Ala Ala Val Gln Leu Leu Phe Pro Ala Ala Arg Val Asp
65 70 75 80

Pro Asp Ala Tyr Pro Cys Tyr Phe Phe Lys Ser Ala Cys Arg Pro Arg
85 90 95

Ala Pro Pro Val Cys Ala Gly Asp Gly Pro Ser Ala Gly Gly Asp Asp
100 105 110

Gly Asp Gly Asp Trp Phe Pro Asp Ala Gly Gly Asp Asp Gly Asp Glu
115 120 125

Glu Trp Glu Glu Asp Thr Asp Pro Met Asp Thr Thr His Gly Pro Leu
130 135 140

Pro Asp Asp Glu Ala Ala Tyr Leu Asp Leu Leu His Glu Gln Ile Pro
145 150 155 160

Ala Ala Thr Pro Ser Glu Pro Asp Ser Val Val Cys Ser Cys Ala Asp
165 170 175

Lys Ile Gly Leu Arg Val Cys Leu Pro Val Pro Ala Pro Tyr Val Val
180 185 190

His Gly Ser Leu Thr Met Arg Gly Val Ala Arg Val Ile Gln Gln Ala
195 200 205

Val Leu Leu Asp Arg Asp Phe Val Glu Ala Val Gly Ser His Val Lys
210 215 220

Asn Phe Leu Leu Ile Asp Thr Gly Val Tyr Ala His Gly His Ser Leu
225 230 235 240

Arg Leu Pro Tyr Phe Ala Lys Ile Gly Pro Asp Gly Ser Ala Cys Gly
245 250 255

Arg Leu Leu Pro Val Phe Val Ile Pro Pro Ala Cys Glu Asp Val Pro
260 265 270

Ala Phe Val Ala Ala His Ala Asp Pro Arg Arg Phe His Phe His Ala
275 280 285

Pro Pro Met Phe Ser Ala Ala Pro Arg Glu Ile Arg Val Leu His Ser
290 295 300

Leu Gly Gly Asp Tyr Val Ser Phe Phe Glu Lys Lys Ala Ser Arg Asn
305 310 315 320

Ala Leu Glu His Phe Gly Arg Arg Glu Thr Leu Thr Glu Val Leu Gly
325 330 335

Arg Tyr Asp Val Arg Pro Asp Ala Gly Glu Thr Val Glu Gly Phe Ala
340 345 350

Ser Glu Leu Leu Gly Arg Ile Val Ala Cys Ile Glu Ala His Phe Pro
355 360 365

Glu His Ala Arg Glu Tyr Gln Ala Val Ser Val Arg Arg Ala Val Ile
370 375 380

Lys Asp Asp Trp Val Leu Leu Gln Leu Ile Pro Gly Arg Gly Ala Leu
385 390 395 400

Asn Gln Ser Leu Ser Cys Leu Arg Phe Lys His Gly Arg Ala Ser Arg
405 410 415

Ala Thr Ala Arg Thr Phe Leu Ala Leu Ser Val Gly Thr Asn Asn Arg
420 425 430

Leu Cys Ala Ser Leu Cys Gln Gln Cys Phe Ala Thr Lys Cys Asp Asn
435 440 445

Asn Arg Leu His Thr Leu Phe Thr Val Asp Ala Gly Thr Pro Cys Ser
450 455 460

Arg Ser Ala Pro Ser Ser Thr Ser Arg Pro Ser Ser Ser
465 470 475

(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 332 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:



Met Leu Ala Val Arg Ser Leu Gln His Leu Thr Thr Val Ile Phe Ile
1 5 10 15

Thr Ala Tyr Gly Leu Val Leu Ala Trp Tyr Ile Val Phe Gly Asp Leu
20 25 30

His Arg Cys Ile Tyr Ala Val Arg Pro Ala Gly Ala His Asn Asp Thr
35 40 45

Ala Leu Val Trp Met Lys Ile Asn Gln Thr Leu Leu Phe Leu Gly Pro
50 55 60

Pro Thr Ala Pro Pro Gly Gly Ala Trp Thr Pro His Ala His Val Cys
65 70 75 80

Tyr Ala Asn Ile Ile Glu Gly Arg Ala Val Ser Leu Pro Ala Ile Pro
85 90 95

Gly Ala Met Ser Arg Arg Val Met Asn Val His Glu Ala Val Asn Cys
100 105 110

Leu Glu Ala Leu Trp Asp Thr Gln Met Arg Leu Val Val Val Gly Trp
115 120 125

Phe Leu Tyr Leu Ala Phe Val His Gln Arg Arg Cys Met Phe Gly Val
130 135 140

Val Ser Pro Ala His Ser Met Val Ala Pro Ala Thr Tyr Leu Leu Asn
145 150 155 160

Tyr Ala Gly Arg Ile Val Ser Ser Val Phe Leu Gln Tyr Pro Tyr Thr
165 170 175

Lys Ile Thr Arg Leu Leu Cys Glu Leu Ser Val Gln Arg Gln Thr Leu
180 185 190

Val Gln Leu Phe Glu Ala Asp Pro Val Thr Phe Leu Tyr His Arg Pro
195 200 205

Ala Val Gly Val Ile Val Gly Cys Glu Leu Leu Leu Arg Phe Val Gly
210 215 220

Leu Ile Val Gly Thr Ala Leu Ile Ser Arg Gly Ala Cys Ala Ile Thr
225 230 235 240

Tyr Pro Leu Phe Leu Thr Ile Thr Thr Trp Cys Phe Val Ser Ile Ile
245 250 255

Ala Leu Thr Glu Leu Tyr Phe Ile Leu Arg Arg Asp Ser Ala Pro Lys
260 265 270

Asn Ala Glu Pro Ala Ala Pro Arg Gly Arg Ser Lys Gly Trp Ser Gly
275 280 285

Val Cys Gly Arg Cys Cys Ser Ile Ile Leu Ser Gly Ile Ala Val Arg
290 295 300

Leu Cys Tyr Ile Ala Val Val Ala Gly Val Val Leu Met Ala Leu Arg
305 310 315 320

Tyr Glu Gln Glu Ile Gln Arg Arg Leu Phe Asp Leu
325 330

(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 574 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Val Thr Pro Asp Gly Glu Gly Gln Gly Gly Val Ser Glu Ser Arg Pro
1 5 10 15

Arg Ser Cys Gly Tyr Lys Gly Ser His Arg Pro Thr Gly Arg Cys Val
20 25 30

Leu Pro Cys Ala Asp Pro Gly Cys Ala Ser Val Pro Leu Leu Asp Ser
35 40 45

Asp Pro Ala Thr Leu Phe Arg His Ala Pro Pro Arg Arg Thr Pro Ala
50 55 60

Ile Pro Ala Pro Ala Thr Tyr Asn Met Ala Thr Asp Ile Asp Met Leu
65 70 75 80

Ile Asp Leu Gly Leu Asp Leu Ser Asp Ser Glu Leu Glu Glu Asp Ala
85 90 95

Leu Glu Arg Asp Glu Glu Gly Arg Arg Asp Asp Pro Glu Ser Asp Ser
100 105 110

Ser Gly Glu Cys Ser Ser Ser Asp Glu Asp Met Glu Asp Pro Cys Gly
115 120 125

Asp Gly Gly Ala Glu Ala Ile Asp Ala Ala Ile Pro Lys Gly Pro Pro
130 135 140

Ala Arg Pro Glu Asp Ala Gly Thr Pro Glu Ala Ser Thr Pro Arg Pro
145 150 155 160

Ala Ala Arg Arg Gly Ala Asp Asp Pro Pro Pro Ala Thr Thr Gly Val
165 170 175

Trp Ser Arg Leu Gly Thr Arg Arg Ser Asp Arg Glu Pro His Gly Gly
180 185 190

Lys Val Ala Arg Ile Gln Pro Pro Ser Thr Lys Ala Pro His Pro Arg
195 200 205

Gly Gly Arg Arg Gly Arg Arg Arg Gly Arg Gly Arg Tyr Gly Pro Gly
210 215 220

Gly Ala Asp Ser Thr Pro Asn Pro Arg Arg Arg Val Ser Arg Asn Ala
225 230 235 240

His Asn Gln Gly Gly Arg His Pro Ala Ser Ala Arg Thr Asp Gly Pro
245 250 255

Gly Ala Thr His Gly Glu Ala Arg Arg Gly Gly Glu Gln Leu Asp Val
260 265 270

Ser Gly Gly Pro Arg Pro Arg Gly Thr Arg Gln Ala Pro Pro Pro Leu
275 280 285

Met Ala Leu Ser Leu Thr Pro Pro His Ala Asp Gly Arg Ala Pro Val
290 295 300

Pro Glu Arg Lys Ala Pro Ser Ala Asp Thr Ile Asp Pro Ala Val Arg
305 310 315 320

Ala Val Leu Arg Ser Ile Ser Ala Ala Val Glu Arg Ile Ser Glu Ser
325 330 335

Phe Gly Arg Ser Ala Leu Val Met Gln Asp Pro Phe Gly Gly Met Pro
340 345 350

Phe Pro Ala Ala Asn Ser Pro Trp Ala Pro Val Leu Ala Thr Gln Ala
355 360 365

Gly Gly Phe Asp Ala Glu Thr Arg Arg Val Ser Trp Glu Thr Leu Val
370 375 380

Ala His Gly Pro Ser Leu Tyr Arg Thr Phe Ala Ala Asn Pro Arg Ala
385 390 395 400

Ala Ser Thr Ala Lys Ala Met Arg Asp Cys Val Leu Arg Gln Glu Asn
405 410 415

Leu Ile Glu Ala Ser Ala Asp Glu Thr Leu Ala Trp Cys Lys Met Cys
420 425 430

Ile His His Asn Leu Pro Leu Arg Pro Gln Asp Pro Ile Ile Gly Thr
435 440 445

Ala Ala Ala Val Leu Glu Asn Leu Ala Thr Arg Leu Arg Pro Phe Leu
450 455 460

Gln Cys Tyr Leu Lys Arg Leu Cys Gly Leu Asp Asp Leu Cys Ser Arg
465 470 475 480

Arg Arg Leu Ser Asp Ile Lys Asp Ile Ala Ser Phe Val Leu Val Ile
485 490 495

Leu Ala Arg Leu Ala Asn Arg Val Glu Arg Gly Val Ser Glu Ile Asp
500 505 510

Tyr Thr Thr Val Gly Val Gly Ala Gly Glu Thr Met His Phe Tyr Ile
515 520 525

Pro Gly Ala Cys Met Ala Gly Leu Ile Glu Ile Leu Asp Thr Gln Glu
530 535 540

Cys Ser Ser Arg Val Cys Glu Leu Thr Ala Ser His Thr Ile Ala Pro
545 550 555 560

Leu Tyr Val His Gly Lys Tyr Phe Tyr Cys Asn Ser Leu Phe
565 570

(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 212 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Met Trp Gly Pro Gly Pro Ala Arg Phe Ile Ala Arg Pro Gly Thr His
1 5 10 15

Gly Arg Arg Val Phe Thr Asp Pro Pro Pro Arg Asn Met Thr Thr Thr
20 25 30

Pro Leu Ser Asn Leu Phe Leu Arg Ala Pro Asp Ile Thr His Val Ala
35 40 45

Pro Pro Tyr Cys Leu Asn Ala Thr Trp Gln Ala Glu Asn Ala Leu His
50 55 60

Thr Thr Lys Thr Asp Pro Ala Cys Leu Ala Ala Arg Ser Tyr Leu Val
65 70 75 80

Arg Ala Ser Cys Ser Thr Ser Gly Pro Ile His Cys Phe Phe Phe Ala
85 90 95

Val Tyr Lys Asp Ser Gln His Ser Leu Pro Leu Val Thr Glu Leu Arg
100 105 110

Asn Phe Ala Asp Leu Val Asn His Pro Pro Val Leu Arg Glu Leu Glu
115 120 125

Asp Lys Arg Gly Gly Arg Leu Arg Cys Thr Gly Pro Phe Ser Cys Gly
130 135 140

Thr Ile Lys Asp Val Ser Gly Asp Ala Gly Glu Tyr Thr Ile Asn Gly
145 150 155 160

Ile Val Tyr His Cys His Cys Arg Tyr Pro Phe Ser Lys Thr Cys Trp
165 170 175

Leu Gly Ala Ser Ala Ala Leu Gln His Leu Arg Ser Ile Ser Ser Ser
180 185 190

Gly Thr Ala Ala Arg Ala Ala Glu Gln Arg Arg His Lys Ile Lys Ile
195 200 205

Lys Ile Lys Val
210

(2) INFORMATION FOR SEQ ID NO: 194:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Met Ile Gly Ala His Pro Gly Val Gly Gly Asp Leu Pro Ser Gly Leu
1 5 10 15

Pro Thr Tyr Ala Glu Ala Thr Ser Asp Arg Pro Pro Thr Tyr Ala Met
20 25 30

Val Met Ala Ala Cys Pro Thr Glu Pro Pro Gly Gly Ser Val Gly Pro
35 40 45

Ala Asp Gln Pro Arg Val Gln Ser Ser Arg Thr Trp Arg Pro Pro Leu
50 55 60

Val Asn Ser Arg Glu Leu Tyr Arg Ala Gln Arg Ala Ala Arg Cys Ala
65 70 75 80

Ser Ser Ser Asp Thr Pro Gln Ala Pro Gly Trp Cys Gly Gly Thr Cys
85 90 95

Arg His Ala Val Phe Gly Val Val Ala Val Val Val Val Ile Ile Leu
100 105 110

Ala Phe Leu Trp Arg
115

(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3699 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

GGGGGCCACG CGGCGGCGGG CCTGACGGAG CTGTGTCAGA CCCTCGCGCC CCGGGACCTC 60 

ACGGACCCGC TGCTGTTTGC GTACGTCGGA TTCCAGGTCG TGAACCACGG GCTGATGTTT 120 

GTGGTCCCCG ACATCGCCGT ATACGCGATG CTGGGGGGCG CCGTGTGGAT CTCGCTGACG 180 

CAGGTGCTTG GGCTCCGGCG CCGCCTTCAC AAGGACCCAG ACGCCGGGCC CTGGGCGGCC 240

GCGACCCTGC GGGGCCTCTT TTTCTCCGTC TACGCATTGG GGTTTGCGGC GGGGGTGCTG 300 

GTGCGGCCGC GGATGGCGGC GAGCCGGCGG TCGGGGTGAT CGCCATTTCA AATAAAAGGC 360 

ACGAGTTCCC CGAATACCAC CGGCGTGTGA TGATTTCGCC CTACCGCTCC GATCCCCGGG 420 

GGGAGGGGGG AAGGAAATGG GGGCGGGGGT GCCGTGGACG GGTATAAAGG CCAGGGGGGC 480

AGGCGGGCCC ATCACTGTTA GGGTGTTAGG TTGGGAGGTG GCACAAAAAG CGACACACCC 540

GTGTTGTAGT TGTCCGCGGG AGGCGGTGGT TTCCGGCAAC CCTCCTCGCT GCGCCGGGCG 600 

CGCCCACCGG TCCTTCGCGG GGGCCGGGGC TCTTCTGGTC ATGGCCCTTG GACGGGTGGG 660 

CCTAGCCGTG GGCCTGTGGG GCCTGCTGTG GGTGGGTGTG GTCGTGGTGC TGGCCAATGC 720 

CTCCCCCGGA CGCACGATAA CGGTGGGCCC GCGGGGGAAC GCGAGCAATG CCGCCCCCTC 780

CGCGTCCCCG CGGAACGCAT CCGCCCCCCG AACCACACCC ACGCCCCCCC AACCCCGCAA 840

GGCGACGAAA AGTAAGGCCT CCACCGCCAA ACCGGCCCCG CCCCCCAAGA CCGGGCCCCC 900 

GAAGACATCC TCGGAGCCCG TGCGATGCAA CCGCCACGAC CCGCTGGCCC GGTACGGCTC 960 

GCGGGTGCAA ATCCGATGCC GGTTTCCCAA CTCCACCCGC ACGGAGTCCC GCCTCCAGAT 1020 

CTGGCGTTAT GCCACGGCGA CGGACGCCGA GATCGGAACG GCGCCTAGCT TAGAGGAGGT 1080 

GATGGTAAAC GTGTCGGCCC CGCCCGGGGG CCAACTGGTG TATGACAGCG CCCCCAACCG 1140

AACGGACCCG CACGTGATCT GGGCGGAGGG CGCCGGCCCG GGCGCCAGCC CGCGGCTGTA 1200 

CTCGGTCGTC GGGCCGCTGG GTCGGCAGCG GCTCATCATC GAAGAGCTGA CCCTGGAGAC 1260 

CCAGGGCATG TACTACTGGG TGTGGGGCCG GACGGACCGC CCGTCCGCGT ACGGGACCTG 1320 

GGTGCGCGTT CGCGTGTTCC GCCCTCCGTC GCTGACCATC CACCCCCACG CGGTGCTGGA 1380 

GGGCCAGCCG TTTAAGGCGA CGTGCACGGC CGCCACCTAC TACCCGGGCA ACCGCGCGGA 1440

GTTCGTCTGG TTCGAGGACG GTCGCCGGGT ATTCGATCCG GCCCAGATAC ACACGCAGAC 1500 

GCAGGAGAAC CCCGACGGCT TTTCCACCGT CTCCACCGTG ACCTCCGCGG CCGTCGGCGG 1560 

CCAGGGCCCC CCGCGCACCT TCACCTGCCA GCTGACGTGG CACCGCGACT CCGTGTCGTT 1620 

CTCTCGGCGC AACGCCAGCG GCACGGCATC GGTGCTGCCG CGGCCAACCA TTACCATGGA 1680 

GTTTACGGGC GACCATGCGG TCTGCACGGC CGGCTGTGTG CCCGAGGGGG TGACGTTTGC 1740

CTGGTTCCTG GGGGACGACT CCTCGCCGGC GGAGAAGGTG GCCGTCGCGT CCCAGACATC 1800

GTGCGGGCGC CCCGGCACCG CCACGATCCG CTCCACCCTG CCGGTCTCGT ACGAGCAGAC 1860 

CGAGTACATC TGCCGGCTGG CGGGATACCC GGACGGAATT CCGGTCCTAG AGCACCACGG 1920

CAGCCACCAG CCCCCGCCGC GGGACCCCAC CGAGCGGCAG GTGATCCGGG CGGTGGAGGG 1980

GGCGGGGATC GGAGTGGCTG TCCTTGTCGC GGTGGTTCTG GCCGGGACCG CGGTAGTGTA 2040

CCTCACCCAC GCCTCCTCGG TGCGCTATCG TCGGCTGCGG TAACTCCGGG GCCGGGCCCG 2100

GCCGCCGGTT GTCTTCTTTT CCACCCCTTC CGTCCCCCGT ACCCACCACA CCCCACCCCA 2160

CCCCCCCGCC GTCCCCCGGG CGTTATAAGC CGCCGCACTC GCTTTTCCCA CCGGAAAATC 2220

CTCGGCCCGA TCCGAACGGC GCACGCCGCG TGGGCTCCAA ACGCCTCCGG AAGAGAGCGC 2280

CCCGCCCCGA TATTCAAGCC CGCGGTGGTG CTATGGCTTT CCGTGCTTCG GGACCCGCCT 2340

ACCAGCCCCT CGCCCCCGCG GCCTCCCCGG CGCGGGCTCG TGTTCCGGCC GTGGCCTGGA 2400

TCGGCGTCGG AGCGATCGTC GGGGCCTTTG CGCTCGTCGC CGCGTTGGTT CTCGTACCCC 2460

CTCGGTCCTC GTGGGGACTC TCGCCGTGCG ACAGCGGCTG GCAGGAATTC AACGCGGGAT 2520

GCGTCGCGTG GGACCCCACC CCCGTCGAGC ACGAGCAGGC GGTCGGCGGC TGCAGCGCGC 2580

CGGCCACCCT TATCCCCCGT GCGGCCGCCA AGCACCTGGC CGCTCTGACA CGCGTCCAGG 2640

CGGAGAGATC GTCGGGTTAC TGGTGGGTGA ACGGAGACGG CATCCGGACC TGTCTGAGAC 2700

TCGTCGACAG CGTCAGTGGC ATCGACGAGT TTTGCGAGGA GCTCGCGATC CGCATATGCT 2760

ACTACCCACG AAGCCCCGGC GGGTTTGTCC GCTTCGTAAC TTCGATACGT AACGCCCTGG 2820

GGTTGCCGTG AGGCGCGCGT CCGACGGTCC CGCTTCTCGC CTCTCTTCTT CCCCCTCCCC 2880

ACCCCACCCA CCGACCAACG ACGGCGTTTG GCCAATACCC TCCTTTTTTC TTTTTCTCTT 2940

CCCCCCCCAA AAAAAAAAAC AATAAACAGC TAATTGCGTA CGACAAACCA TGCGGAACTC 3000

GCTGTTTTTT TTTCTCTGTT TGTTACTTTT TATTGAAAAC AGACATACGG GGAAAGGGGC 3060

CGGAAACCGA GACGGTGGGG CCGGCGGTCG CATTTTTTTA ATGGCTCTGG TGTCGGCCGC 3120

GTTTGAGCTT CGTCAACAGG GCGCTGAGGG CGGCGACGTT TGTCGGGCCG TCGTTGGCCA 3180

GCGCGTTGGT CCGGGGGCGG GCGGGCATGG GCGACAGGCT TAGTCCCGGG TCCGGGGCGC 3240

GTGTGGCCCC CGGAGGGGAG AAGAGGGCAG ACCCGCCCCA GTCGTACAGG GGATTTTCCG 3300

CCTCGATGTA CGGGGAGTCC GGGGCGTCTC CCGGCGGGGC CGCCCCGCCG GCGTCTTGCC 3360

GGCGAAGGCA GATGTTTTCG TATACCCGAA CCCAGGGGAT CTCCTCGTAG ACGCGCCCCC 3420

CATCCTCGCT CACCGACTCG TAAATGGAAT CTGCGTCCTC GGAGGGGGCG CGGGGGGCGT 3480

GGCTTTCGGC CGGCCAGGCG GCGGCGGTGG TGTCGGCGGC GGGGGTGGCG CCAAGCCCGA 3540

CGCCCGCGGG CATGGCGGCG TCATCGTCGG GCAGCAGATA CGTGTTTTCC ATCTGGTCCG 3600

GTTCGGCCTC CGCGTCTGGC CCCCAGGTCC GCACCGCGTC GTAAACCCCG GCGGCCTCGC 3660

GCTGAGCCGC GAGCGGGCGC GCCGCGGCTG CCGGCCGC 3699



(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

Xaa Xaa Xaa Xaa Xaa Gly Gly His Ala Ala Ala Gly Leu Thr Glu Leu
1 5 10 15

Cys Gln Thr Leu Ala Pro Arg Asp Leu Thr Asp Pro Leu Leu Phe Ala
20 25 30

Tyr Val Gly Phe Gln Val Val Asn His Gly Leu Met Phe Val Val Pro
35 40 45

Asp Ile Ala Val Tyr Ala Met Leu Gly Gly Ala Val Trp Ile Ser Leu
50 55 60

Thr Gln Val Leu Gly Leu Arg Arg Arg Leu His Lys Asp Pro Asp Ala
65 70 75 80

Gly Pro Trp Ala Ala Ala Thr Leu Arg Gly Leu Phe Phe Ser Val Tyr
85 90 95

Ala Leu Gly Phe Ala Ala Gly Val Leu Val Arg Pro Arg Met Ala Ala
100 105 110

Ser Arg Arg Ser Gly
115

(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

Met Gly Ala Gly Val Pro Trp Thr Gly Ile Lys Arg Ala Gly Gly Pro
1 5 10 15

Ile Thr Val Arg Val Leu Gly Trp Glu Val Ala Gln Lys Ala Thr His
20 25 30

Pro Cys Cys Ser Cys Pro Arg Glu Ala Val Val Ser Gly Asn Pro Pro
35 40 45

Arg Cys Ala Gly Arg Ala His Arg Ser Phe Ala Gly Ala Gly Ala Leu
50 55 60

Leu Val Met Ala Leu Gly Arg Val Gly Leu Ala Val Gly Leu Trp Gly
65 70 75 80

Leu Leu Trp Val Gly Val Val Val Val Leu Ala Asn Asp Gly Arg Thr
85 90 95

Ile Thr Val Gly Pro Arg Gly Asn Asn Ala Ala Pro Ser Asp Arg Asn
100 105 110

Ala Ser Ala Pro Arg Thr Thr Pro Thr Pro Pro Gln Pro Arg Lys Ala
115 120 125

Thr Lys Ser Lys Ala Ser Thr Ala Lys Pro Ala Pro Pro Pro Lys Thr
130 135 140

Gly Pro Pro Lys Thr Ser Ser Glu Pro Val Arg Cys Asn Arg His Asp
145 150 155 160

Pro Leu Ala Arg Tyr Gly Ser Arg Val Gln Ile Arg Cys Arg Phe Pro
165 170 175

Asn Ser Thr Arg Thr Glu Ser Arg Leu Gln Ile Trp Arg Tyr Ala Thr
180 185 190

Ala Thr Asp Ala Glu Ile Gly Thr Ala Pro Ser Leu Glu Glu Val Met
195 200 205

Val Asn Val Ser Ala Pro Pro Gly Gly Gln Leu Val Tyr Asp Ser Ala
210 215 220

Pro Asn Arg Thr Asp Pro His Val Ile Trp Ala Glu Gly Ala Gly Pro
225 230 235 240

Gly Asp Arg Lys Val Val Gly Pro Leu Gly Arg Gln Arg Leu Ile Ile
245 250 255

Glu Glu Leu Thr Leu Glu Thr Gln Gly Met Tyr Tyr Trp Val Trp Gly
260 265 270

Arg Thr Asp Arg Pro Ser Ala Tyr Gly Thr Trp Val Arg Val Arg Val
275 280 285

Phe Arg Pro Pro Ser Leu Thr Ile His Pro His Ala Val Leu Glu Gly
290 295 300

Gln Pro Phe Lys Ala Thr Cys Thr Ala Ala Thr Tyr Tyr Pro Gly Asn
305 310 315 320

Arg Ala Glu Phe Val Trp Phe Glu Asp Gly Arg Arg Val Phe Asp Pro
325 330 335

Ala Gln Ile His Thr Gln Thr Gln Glu Asn Pro Asp Gly Phe Ser Thr
340 345 350

Val Ser Thr Val Thr Ser Ala Ala Val Gly Gly Gln Gly Pro Pro Arg
355 360 365

Thr Phe Thr Cys Gln Leu Thr Trp His Arg Asp Ser Val Ser Phe Ser
370 375 380

Arg Arg Asn Ala Ser Gly Thr Ala Ser Val Leu Pro Arg Pro Thr Ile
385 390 395 400

Thr Met Glu Phe Thr Gly Asp His Ala Val Cys Thr Ala Gly Cys Val
405 410 415

Pro Glu Gly Val Thr Phe Ala Trp Phe Leu Gly Asp Asp Ser Ser Pro
420 425 430

Ala Glu Lys Val Ala Val Ala Ser Gln Thr Ser Cys Gly Arg Pro Gly
435 440 445

Thr Ala Thr Ile Arg Ser Thr Leu Pro Val Ser Tyr Glu Gln Thr Glu
450 455 460

Tyr Ile Cys Arg Leu Ala Gly Tyr Pro Asp Gly Ile Pro Val Leu Glu
465 470 475 480

His His Gly Ser His Gln Pro Pro Pro Arg Asp Pro Thr Glu Arg Gln
485 490 495

Val Ile Arg Ala Val Glu Gly Ala Gly Ile Gly Val Ala Val Leu Val
500 505 510

Ala Val Val Leu Ala Gly Thr Ala Val Val Tyr Leu Thr His Ala Ser
515 520 525

Ser Val Arg Tyr Arg Arg Leu Arg
530 535



(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 189 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Val Gly Ser Lys Arg Leu Arg Lys Arg Ala Pro Arg Pro Asp Ile Gln
1 5 10 15

Arg Gly Ala Met Ala Phe Arg Ala Ser Gly Pro Ala Tyr Gln Pro Leu
20 25 30

Ala Pro Ala Asp Ala Arg Ala Arg Val Pro Ala Val Ala Trp Ile Gly
35 40 45

Val Gly Ala Ile Val Gly Ala Phe Ala Leu Val Ala Ala Leu Val Leu
50 55 60

Val Pro Pro Arg Ser Ser Trp Gly Leu Ser Pro Cys Asp Ser Gly Trp
65 70 75 80

Gln Glu Phe Asn Ala Gly Cys Val Ala Trp Asp Pro Thr Pro Val Glu
85 90 95

His Glu Gln Ala Val Gly Gly Cys Ser Ala Pro Ala Thr Leu Ile Pro
100 105 110

Arg Ala Ala Ala Lys His Leu Ala Ala Leu Thr Arg Val Gln Ala Glu
115 120 125

Arg Ser Ser Gly Tyr Trp Trp Val Asn Gly Asp Gly Ile Arg Thr Cys
130 135 140

Leu Arg Leu Val Asp Ser Val Ser Gly Ile Asp Glu Phe Cys Glu Glu
145 150 155 160

Leu Ala Ile Arg Ile Cys Tyr Tyr Pro Arg Ser Pro Gly Gly Phe Val
165 170 175

Arg Phe Val Thr Ser Ile Arg Asn Ala Leu Gly Leu Pro
180 185

(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 198 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

Gln Arg Pro Ala Ala Ala Ala Arg Pro Leu Ala Ala Gln Arg Glu Ala
1 5 10 15

Ala Gly Val Tyr Asp Ala Val Arg Thr Trp Gly Pro Asp Ala Glu Ala
20 25 30

Glu Pro Asp Gln Met Glu Asn Thr Tyr Leu Leu Pro Asp Asp Asp Ala
35 40 45

Ala Met Pro Ala Gly Val Gly Leu Gly Ala Thr Pro Ala Ala Asp Thr
50 55 60

Thr Ala Ala Ala Trp Pro Ala Glu Ser His Ala Pro Arg Ala Pro Ser
65 70 75 80

Glu Asp Ala Asp Ser Ile Tyr Glu Ser Val Ser Glu Asp Gly Gly Arg
85 90 95

Val Tyr Glu Glu Ile Pro Trp Val Arg Val Tyr Glu Asn Ile Cys Leu
100 105 110

Arg Arg Gln Asp Ala Gly Gly Ala Ala Pro Pro Gly Asp Ala Pro Asp
115 120 125

Ser Pro Tyr Ile Glu Ala Glu Asn Pro Leu Tyr Asp Trp Gly Gly Ser
130 135 140

Ala Leu Phe Ser Pro Pro Gly Ala Thr Arg Ala Pro Asp Pro Gly Leu
145 150 155 160

Ser Leu Ser Pro Met Pro Ala Arg Pro Arg Thr Asn Ala Asn Asp Gly
165 170 175

Pro Thr Asn Val Ala Ala Leu Ser Ala Leu Leu Thr Lys Leu Lys Arg
180 185 190

Gly Arg His Gln Ser His
195

(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 152 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:

CTGTGTGAAA TTGTTATCCG CTCACAATTC CACACAACAT ACGAGCCGGA AGCATAAAGT 60
GTAAAGCCTG GGGTGCCTAA TGAGTGAGCT AACTCACATT AATTGCGTTG CGCTCACTGC 120
CCGCTTTCCA GTCGGGAAAC CTGTCGTGCC A 152

(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 129 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

CCCGGAGCCC GGCGGCCGCA GCCGAGCAGC GCCGCGGGCT CCGGGGCCGG GCCGGGCCGG 60
CAACGCCCCG CGCCGGCCGC GGCGGTGAGA ACCCCTGTGT CATTGTTTAC GTGGCCGCGG 120
GCCAGCAG 129

(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 127 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

GGTGGGGCGC GGCGGCCGGC TCGGGGTGGG GGGAGAGTGT CGTGGGTGTG TTTTCGTGTC 60
CCCCACCACC ACTCCCACCC CGACCGCCGC CGCGCCCGCG TTTCTGCCGC CCGCGCGCTC 120
CTGTGT 127

(2) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 157 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

ATGCACGTGT AACCGCCAGT CCGTGCTTGC CTAGCGAACT CACCCGTCCC GGCTGGCGTG 60
CGCAGCCCGG GCCGTGTTGC GGGCCCTCTT AAGGGGCGGC GGCAGGACGG GGACTCCGCC 120
CCGCCTCCTT TCCCCCGGGG AGTCAACCCC CGGGGG 157

(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16813 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:



GGGGGGACGG GACGGGGGGA CGGGACGGGG GGGCCCCGAT CCCAACATCC GCGCTTTCTC 60 

GCAGGCCGGG CGCCGCCTTC GTGGACGGGA CACCGGTGTG GTAACTGGCG ACAAGGCGTC 120 

GCCACTATGG CAGACATCCC CCCGGACCCG CCCGCGCTCA ACACGACGCC TGCGAATCAT 180 

GCTCCCCCAT CCCCACCCCC GGGTTCACGG AAGCGCAGAC GCCCCGTCCT CCCCAGCTCG 240 

TCGGAATCTG AGGGTAAGCC CGACACAGAA TCGGAATCCT CCTCGACCGA GTCGTCCGAG 300

GATGAGGCGG GAGACCTACG CGGCGGGCGC CGTCGCTCCC CGCGGGAGCT CGGGGGGAGG 360 

TATTTTTTGG ATCTGTCGGC AGAATCGACC ACGGGGACGG AATCGGAGGG AACGGGGCCG 420 

TCGGACGACG ATGATGATGA TGCGTCAGAC GGCTGGTTGG TTGACACCCC CCCCCGTAAA 480 

TCCAAGCGAC CCCGAATCAA CCTGCGATTA ACGAGCTCCC CCGACCGGCG CGCGGGTGTG 540 

GTTTTCCCCG AGGTGTGGAG AAACGACAGA CCTATCCGCG CGGCGCAACC CCAGGCCCCG 600

GCCCAGTCTT CCGGGGATCG CGCAGCCGCA CCGCGGCGCT CTGCTCGCCA GGCCCAGATG 660 

CGGAGCGGAG CCGCCTGGAC GCTTGATCTG CATTACATAC GCCAGTGCGT CAACCAGCTC 720 

TTTCGGATCC TGCGTGCCGC CCCGAACCCG CCCGGCAGCG CCAACCGCCT GCGCCACCTG 780

GTGCGAGACT GCTACCTTAT GGGCTACTGC CGGACCCGCC TGGGGCCGCG CACGTGGGGC 840

CGCCTGCTGC AGATCTCGGG CGGAACCTGG GACGTGCGCC TGCGAAACGC AATCCGGGAG 900

GTCGAGGCGC GTTTTGAACC CGCCGGCGAG CCCGTGTGGG AGCTGCCCTG TCTGAACGCC 960

AGGCGTTACG GCCCCGAGTG TGATGTTGGC AATCTCGAGA CCAACGGCGG CTCGACGAGC 1020

GATGATGAGA TATCGGATGC GACGGACTCG GACGATACCC TCGCGTCCCA TTCCGACACG 1080

GAGGGGGGGC CCTCCCCGGC CGGCCGGGAG AACCCGGAAT CCGCGTCCGG CGGGGCTATC 1140

GCGGCTCGGC TGGAGTGTGA GTTTGGGACG TTTGACTGGA CGTCCGAGGA GGGCTCCCAG 1200

CCCTGGCTGT CCGCGGTGGT CGCCGATACC AGCTCCGCCG AACGCTCTGG CCTACCCGCC 1260

CCGGGCGCGT GTCGCGCAAC GGAAGCCCCA GAACGCGAGG ACGGGTGCCG AAAAATGCGC 1320

TTCCCCGCCG CCTGCCCCTA TCCCTGCGGC CACACATTTC TCCGGCCATG AGCGCGGGAC 1380

CCCCAGCCCG GTGTGTTTGC CAAACGAAAA TAAACGCCCT ACAAGAAAGC TTTTGTGTCT 1440

GAGTGTCTGG TTTTTCTGGG GGTGGAGGAA GGAACGACAA AAAAAGAAAC AAACGCGACA 1500

CCGCTCGTAC GTGTAATGGG GCGCAGTGTT TTTTATTAGC ATCGGGGGGG GGTTAGAGGT 1560

TGGTGATTGG ATAGCAAACG TGGGATGACG GAGGCCACTC GTCGCCAACG GCCAGCGGGG 1620

GCCCGGGGTT CTGGGGGTCA TCGTCCCCCG TCTGCCAGGA GGGCTCATCG GGAATCTCGG 1680

GTCGCCCCAT GCACGTAAAA CACGGGCGCT GCGTGGGGTG GGTCGCCGGA TGCGGGCGGG 1740

ATGATGCGGG GCGGGGTTTG TTGTGAGGAG CCACGAGGGA CCGTAGCCAG CGAAGACAGC 1800

TGCGTTCCCG GTCGCCGGGC ACCACCACGC CGTATTGGTA TTCGTATCGG CTAAGGAGAT 1860

TTTCCAGGGG GTGATTAGGC GCTGCGGGGA ACGGGGTCCA CGACACGGTC CGCTCGGGCA 1920

AAAACCGATC GGGCAGGGGC CACGGTTCCC CCACCCACGC GTCGTTGGTC TTCATGGCGA 1980

TGAAGCGAAA CCCCAGCCGG GTTTTTTGTG CGTACTCTAA AAACGGCACA CACAGGTCCG 2040

CCGCCCCGAC CACCCACAGG TGGTATAGCC GGTGGGGGCC GGGGCGCTCT TGATGCAGGA 2100

GCCGAAAACA CGCAGGGGCA TCCAGAATCT CGATGCTTTC CAGGGGGTCG TCCTCCGCAA 2160

ACAGGCCCGT CGTGGTGTTT GGGGGACAGC GACAGGAGCG GGTTCGCACG ATCGGTCGGG 2220

TGAATTTGGG CAAGTCCATC AGAGGCTCGG CCAGCCTGCG AAGGTTCGCC GGGCGAACCA 2280

CCACCGGGGT TCCCAGAGGC TCGGAGGCCA GGATCCGGCA TTGCCGAAGC AGAAAACTCC 2340

ACAGAGCCGG GCTTGCGTCA GCGGAAGTCC GCGGCAGGGC GTTTCGTTGG TCTAGGAGGG 2400

TAACCACACT TACAACAACA ACGCCCATGT CGGTATATTA GGCCCGTGGT CCGATCTTCA 2460

CTCACTCGCC TGTCTGCGGA CCTATGCACG GCGGGACGGC GCGCGGACCC GGGGGGGCTG 2520

CTTGCTATCA CACGGCCCGT TCGCACGTTC GATTTTTTCA GCCTTGTTTG GTTGGCTAGG 2580

TATCCCGGAT AATCTGACGT TCCGGATATA GGGGGCGGGG GTAGTGGGGG GGTGTGTCGA 2640

CAAACTGCCG CTTCTTAAAA CACCGGGGCC CGTCGCTCGG GGTGCTCGTT GGTTGGCACG 2700

CGCGACGCGG CGAATGGCCT GTCGTAAGTT CTGTGGGGTC TACCGTAGAC CCGACAAGAG 2760

ACAGGAGGCG TCCGTCCCGC CGGAGACAAA CACGGCCCCG GCCTTCCCGG CGAGCACCTT 2820

TTATACCCCC GCGGAGGATG CGTACCTGGC CCCCGGGCCC CCGGAAACCA TCCACCCTTC 2880

CCGCCCACCG TCCCCCGGCG AGGCTGCGCG CCTGTGTCAG CTGCAGGAGA TCTTGGCCCA 2940

GATGCACAGC GACGAGGACT ACCCCATCGT GGACGCCGCG GGTGCGGAGG AGGAAGACGA 3000

GGCCGACGAT GACGCCCCGG ATGACGTGGC CTACCCGGAG GACTACGCGG AGGGGCGTTT 3060

TCTGTCCATG GTTTCGGCCG CCCCCCTGCC CGGAGCCAGC GGCCATCCTC CTGTTCCGGG 3120

CCGCGCAGCC CCCCCCGACG TCCGGACCTG CGACAGCGGT AAGGTGGGGG CCACGGGGTT 3180

CACCCCGGAA GAGCTCGACA CCATGGACCG GGAGGCACTT CGGGCCATCA GCCGCGGGTG 3240

CAAGCCCCCT TCGACCCTGG CAAAACTGGT GACCGGGCTG GGATTCGCGA TCCACGGAGC 3300

GCTCATCCCG GGGTCGGAGG GGTGTGTCTT TGATAGCAGC CACCCGAACT ACCCTCATCG 3360

GGTAATCGTC AAGGCGGGGT GGTACGCCAG CACGAACCAC GAGGCGCGGC TGCTGAGACG 3420 

CCTGAACCAC CCCGCGATCC TACCCCTCCT GGACCTGCAC GTCGTTTCTG GGGTCACGTG 3480 

TCTGGTCCTC CCCAAGTATC ACTGCGACCT GTATACCTAT CTGAGCAAGC GCCCGTCTCC 3540 

GTTGGGCCAC CTACAGATAA CCGCGGTCTC CCGGCAGCTC TTGAGCGCCA TCGACTACGT 3600 

CCACTGCGAA GGCATCATCC ACCGCGATAT TAAGACCGAG AACATCCTCA TCAACACCCC 3660

CGAGAACATC TGTCTGGGGG ACTTTGGGGC GGCGTGCTTT GTGCGCGGGT GTCGATCGAG 3720 

CCCCTTCCAT TACGGGATCG CAGGCACCAT CGATACAAAC GCCCCCGAGG TCCTGGCCGG 3780 

GGATCCGTAC ACCCAGGTAA TCGACATCTG GAGCGCCGGC CTGGTGATCT TTGAGACCGC 3840

CGTCCACACC GCGTCCTTGT TCTCGGCCCC GCGCGACCCC GAAAGGCGGC CGTGCGACAA 3900

CCAGATCGCG CGCATCATCC GACAGGCCCA GGTACACGTC GACGAGTTTC CAACGCACGC 3960

GGAATCGCGC CTCACCGCGC ACTACCGCTC GCGGGCGGCC GGGAACAATC GTCCGGCGTG 4020 

GACCCGACCG GCATGGACCC GCTACTACAA GATCCACACA GACGTCGAAT ATCTCATCTG 4080 

CAAAGCCCTT ACCTTTGACG CGGCGCTCCG CCCAAGCGCC GCGGAGTTGC TGCGCCTGCC 4140 

GCTATTTCAC CCTAAGTGAC CCCGCTCCCC CCGGGGGGCG TGGAGGGGGG GCTGGTTGGA 4200 

TGTTTTTGCA CAAAAAGACG CGGCCCTCGG GCTTTGGTGT TTTTGGCACC TTGCCGCCCG 4260

GCGTCATGCA CGCCATCGCT CCCAGGTTGC TTCTTCTTTT TGTTCTTTCT GGTCTTCCGG 4320 

GGACACGCGG CGGGTCGGGT GTCCCCGGAC CAATTAATCC CCCCAACAAC GATGTTGTTT 4380 

TCCCGGGAGG TTCCCCCGTG GCTCAATATT GTTATGCCTA TCCCCGGTTG GACGATCCCG 4440 

GGCCCTTGGG TTCCGCGGAC GCCGGGCGGC AAGACCTGCC CCGGCGCGTC GTCCGTCACG 4500 

AGCCCCTGGG CCGCTCGTTC CTCACGGGGG GGCTGGTTTT GCTGGCGCCG CCGGTACGCG 4560

GATTTGGCGC ACCCAACGCA ACGTATGCGG CCCGTGTGAC GTACTACCGG CTCACCCGCG 4620 

CCTGCCGTCA GCCCATCCTC CTTCGGCAGT ATGGAGGGTG TCGCGGCGGC GAGCCGCCGT 4680 

CCCCAAAGAC GTGCGGGTCG TACACGTACA CGTACCAGGG CGGCGGGCCT CCGACCCGGT 4740 

ACGCTCTCGT AAATGCTTCC CTGCTGGTGC CGATCTGGGA CCGCGCCGCG GAGACATTCG 4800 

AGTACCAGAT CGAACTCGGC GGCGAGCTGC ACGTGGGTCT GTTGTGGGTA GAGGTGGGCG 4860

GGGAGGGCCC CGGCCCCACC GCCCCCCCAC AGGCGGCGCG TGCGGAGGGC GGCCCGTGCG 4920 

TCCCCCCGGT CCCCGCGGGC CGCCCGTGGC GCTCGGTGCC CCCGGTATGG TATTCCGCCC 4980 

CCAACCCCGG GTTTCGTGGC CTGCGTTTCC GGGAGCGCTG TCTGCCCCCA CAGACGCCCG 5040 

CCGCCCCCAG CGACCTACCA CGCGTCGCTT TTGCTCCCCA GAGCCTGCTG GTGGGGATTA 5100 

CGGGCCGCAC GTTTATTCGG ATGGCACGAC CCACGGAAGA CGTCGGGGTC CTGCCACCCC 5160

ATTGGGCCCC CGGGGCCCTA GATGACGGTC CGTACGCCCC CTTCCCACCC CGCCCGCGGT 5220 

TTCGACGCGC CCTGCGGACA GACCCCGAGG GGGTCGACCC CGACGTTCGG GCCCCCCTAA 5280 

CCGGGCGGCG CCTCATGGCC TTGACCGAGG ACGCGTCCTC CGATTCGCCT ACGTCCGCTC 5340 

CGGAGAAGAC GCCCCTCCCT GTGTCGGCCA CCGCCATGGC GCCCTCAGTC GACCCAAGCG 5400 

CGGAACCGAC CGCCCCCGCA ACCACTACTC CCCCCGACGA GATGGCCACA CAAGCCGCAA 5460

CGGTCGCCGT TACGCCGGAG GAAACGGCAG TCGCCTCCCC GCCCGCGACT GCATCCGTGG 5520 

AGTCGTCGCC ACTCCCCGCC GCGGCGGCAA CGCCCGGGGC CGGGCACACG AACACCAGCA 5580 

GCGCCCCCGC AGCGAAAACG CCCCCCACCA CACCAGCCCC CACGACCCCC CCGCCCACGT 5640 

CTACCCACGC GACCCCCCGC CCCACGAGTC CGGGGCCCCA AACAACCCCT CCCGGACCCG 5700 

CAACCCCGGG TCCGGTGGGC GCCTCCGCCG CACCCACGGC CGATTCCCCC CTCACCGCCT 5760

CGCCCCCCGC TACCGCGCCG GGGCCCTCGG CCGCCAACGT TTCGGTCGCC GCGACCACCG 5820 

CCACGCCCGG AACCCGGGGC ACCGCCCGTA CCCCCCCAAC GGACCCAAAG ACGCAGCCAC 5880 

ACGGACCCGC GGACGCTCCC CCCGGCTCGC CAGCCCCCCC ACCCCCCGAA CATCGCGGCG 5940 
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GACCCGAGGA GTTTGAGGGC GCCGGGGACG GCGAACCCCC CGATGACGAC GACAGCGCCA 6000 

CCGGCCTCGC CTTCCGAACT CCGAACCCCA ACAAACCACC CCCCGCGCGC CCCGGGCCCA 6060 

TCCGCCCCAC GCTCCCGCCA GGAATTCTTG GGCCGCTCGC CCCCAACACG CCTCGCCCCC 6120 

CCGCCCAAGC TCCCGCTAAG GACATGCCCT CGGGCCCCAC ACCCCAACAC ATCCCCCTGT 6180 

TCTGGTTCCT AACGGCCTCC CCTGCTCTAG ATATCCTCTT TATCATCAGC ACCACCATCC 6240

ACACGGCGGC GTTCGTTTGT CTGGTCGCCT TGGCAGCACA ACTTTGGCGC GGCCGGGCGG 6300 

GGCGCAGGCG ATACGCGCAC CCGAGCGTGC GTTACGTATG TCTGCCACCC GAGCGGGATT 6360 

AGGGGGTGGG GTGGGGGCGA GAAACGATGA AGGACGGGAA AGGGAACAGC GACCAAATGC 6420 

CACGATAAGA ACAATAAACC TGTGACGTCA ATCGGATATG TGAGTTTGGT TGTGTTTTGT 6480 

GGGACTGGGG GCGGGGGGTG GGAGGTATCA GTGGGTGACA GAGTCTTTTA AAAGACGTGT 6540

CCCGGGGCCC TCGAGACGCG CAACTTTTGG CCACACAGAG AAAGGCCCCC AGACGAAGTC 6600 

ACCCGGGTCC CCGAACAAAA ACAAAAACCT TGACCGCCGC CGGGGGGCGT GCCTGTTGTT 6660 

TTGGTCTCAA TGGATCGGTA TGCCGTTCGG ACCTGGGGGA TTGTGGGAAT CCTCGGGTGT 6720 

GCTGCTGTTG GGGCCGCACC CACCGGCCCC GCGTCCGATA CAACAAACGC GACCGCACGC 6780 

CTCCCCACGC ACCCCCCACT CATCCGTTCC GGGGGCTTTG CCGTCCCCCT CATCGTGGGG 6840

GGGCTGTGTC TCATGATTCT GGGGATGGCG TGTCTACTCG AGGTCCTGCG TCGCCTGGGT 6900 

CGCGAGTTGG CGAGGTGCTG CCCCCACGCG GGCCAATTTG CCCCATGATT TTTCGCCTTT 6960 

CTGGCCTTGC CCCCACCCCA TCGCCCCGAT TGTGTGTCGG GTGCCCGGGG TACAGCAGCT 7020 

ATGGAGCGGT CGGTAATATA ACTTTGGTTG TCGCCACACG CCCCGTGCCG GGCATGGGTT 7080 

GTGCGGGAAA GACGAAATAA TCCGGCGATC CCCAAGCGTA CCAACTTGGG GGGGGGGGGA 7140

AAGAAACTAA AAACACATCA AGCCCACAAC CCATCCCACA AGGGGGGTTA TGGCGGACCC 7200 

ACCGCACCAC CATACTCCGA TTCGACCACA TATGCAACCA AATCACCCCC AGAGGGGAGG 7260 

TTCCATTTTT ACGAGGAGGA GGAGTATAAT AGAGTCTTTG TGTTTAAAAC CCGGGGTCGG 7320 

TGTGGTGTTC GGTCATAAGC TGCATTGCGA ACGACTAGTC GCCGTTTTTC GTGTGCATCG 7380 

CGTATCACGG CATGGGGCGT TTGACCTCCG GCGTCGGGAC GGCGGCCCTG CTAGTTGTCG 7440

CGGTGGGACT CCGCGTCGTC TGCGCCAAAT ACGCCTTAGC AGACCCCTCG CTTAAGATGG 7500 

CCGATCCCAA TCGATTTCGC GGGAAGAACC TTCCGGTTTT GGACCAGCTG ACCGACCCCC 7560 

CCGGGGTGAA GCGTGTTTAC CACATTCAGC CGAGCCTGGA GGACCCGTTC CAGCCCCCCA 7620 

GCATCCCGAT CACTGTGTAC TACGCAGTGC TGGAACGTGC CTGCCGCAGC GTGCTCCTAC 7680 

ATGCCCCATC GGAGGCCCCC CAGATCGTGC GCGGGGCTTC GGACGAGGCC CGAAAGCACA 7740

CCTACAACCT GACCATCGCC TGGTATCGCA TGGGAGACAA TTGCGCTATC CCCATCACGG 7800 

TTATGGAATA CACCGAGTGC CCCTACAACA AGTCGTTGGG GGTCTGCCCC ATCCGAACGC 7860 

AGCCCCGCTG GAGCTACTAT GACAGCTTTA GCGCCGTCAG CGAGGATAAC CTGGGATTCC 7920 

TGATGCACGC CCCCGCCTTC GAGACCGCGG GTACGTACCT GCGGCTAGTG AAGATAAACG 7980 

ACTGGACGGA GATCACACAA TTTATCCTGG AGCACCGGGC CCGCGCCTCC TGCAAGTACG 8040

CTCTCCCCCT GCGCATCCCC CCGGCAGCGT GCCTCACCTC GAAGGCCTAC CAACAGGGCG 8100 

TGACGGTCGA CAGCATCGGG ATGCTCCCCC GCTTTATCCC CGAAAACCAG CGCACCGTCG 8160 

CCCTATACAG CTTAAAAATC GCCGGGTGGC ACGGCCCCAA GCCCCCGTAC ACCAGCACCC 8220 

TGCTGCCGCC GGAGCTGTCC GACACCACCA ACGCCACGCA ACCCGAACTC GTTCCGGAAG 8280 

ACCCCGAGGA CTCGGCCCTC TTAGAGGATC CCGCCGGGAC GGTGTCTTCG CAGATCCCCC 8340

CAAACTGGCA CATCCCGTCG ATCCAGGACG TCGCGCCGCA CCACGCCCCC GCCGCCCCCA 8400 

GCAACCCGGG CCTGATCATC GGCGCGCTGG CCGGCAGTAC CCTGGCGGTG CTGGTCATCG 8460 

GCGGTATTGC GTTTTGGGTA CGCCGCCGCG CTCAGATGGC CCCCAAGCGC CTACGTCTCC 8520
CCCACATCCG GGATGACGAC GCGCCCCCCT CGCACCAGCC ATTGTTTTAC TAGAGGAGTA 8580

TCCCCGCTCC CGTGTACCTC TGGGCCCGTG TGGGAGGGTG GCTGGGGTAT TTGGGTGGGA 8640 

CTTGGACTCC GCATAAAGGG AGTCTCGAAG GAGGGAAACT AGGACAGTTC ATAGGCCGGG 8700 

AGCGTGGGGC GCGCACCGCT GTCCCGACGA TTAGCCACCG CGCCCACAGT CACCTCGACC 8760 

CGTCCGATCC CGGTATGCCC GGCCGCTCGC TGCAGGGCCT GGCGATCCTG GGCCTGTGGG 8820

TCTGCGCCAC CGGCCTGGTC GTCCGCGGCC CCACGGTCAG TCTGGTCTCA GACTCACTCG 8880 

TGGATGCCGG GGCCGTGGGG CCCCAGGGCT TCGTGGAAGA GGACCTGCGT GTTTTCGGGG 8940 

AGCTTCATTT TGTGGGGGCC CAGGTCCCCC ACACAAACTA CTACGACGGC ATCATCGAGC 9000 

TGTTTCACTA CCCCCTGGGG AACCACTGCC CCCGCGTTGT ACACGTGGTC ACACTGACCG 9060 

CATGCCCCCG CCGCCCCGCC GTGGCGTTCA CCTTGTGTCG CTCGACGCAC CACGCCCACA 9120

GCCCCGCCTA TCCGACCCTG GAGCTGGGTC TGGCGCGGCA GCCGCTTCTG CGGGTTCGAA 9180 

CGGCAACGCG CGACTATGCC GGTCTGTATG TCCTGCGCGT ATGGGTCGGC AGCGCGACGA 9240 

ACGCCAGCCT GTTTGTTTTG GGGGTGGCGC TCTCTGCCAA CGGGACGTTT GTGTATAACG 9300 

GCTCGGACTA CGGCTCCTGC GATCCGGCGC AGCTTCCCTT TTCGGCCCCG CGCCTGGGAC 9360 

CCTCGAGCGT ATACACCCCC GGAGCCTCCC GGCCCACCCC TCCACGGACA ACGACATCCC 9420

CGTCCTCCCC CCGAGACCCG ACCCCCGCCC CCGGGGACAC AGGGACGCCC GCGCCCGCGA 9480

GCGGCGAGAG AGCCCCGCCC AATTCCACGC GATCGGCCAG CGAATCGAGA CACAGGCTAA 9540

CCGTAGCCCA GGTAATCCAG ATCGCCATAC CGGCGTCCAT CATCGCCTTT GTGTTTCTGG 9600 

GCAGCTGTAT CTGCTTCATC CATAGATGCC AGCGCCGATA CAGGCGCCCC CGCGGCCAGA 9660 

TTTACAACCC CGGGGGCGTT TCCTGCGCGG TCAACGAGGC GGCCATGGCC CGCCTCGGAG 9720

CCGAGCTGCG ATCCCACCCA AACACCCCCC CCAAACCCCG ACGCCGTTCG TCGTCGTCCA 9780 

CGACCATGCC TTCCCTAACG TCGATAGCTG AGGAATCGGA GCCAGGTCCA GTCGTGCTGC 9840 

TGTCCGTCAG TCCTCGGCCC CGCAGTGGCC CGACGGCCCC CCAAGAGGTC TAGGTCCAAG 9900 

CGGGCCGTTC GGCAGGCCCG CCCCACCGCC CCCATCGTGG TTATTTCCCC CCCAATAAAC 9960 

CGATGTTATT TGCCTATATG CGTGTGTTGG ATCCCTTTGT GATCGTTCGT CATTCCCCGG 10020

ATGGCATGGG AGGCGGGTAA TGGATGGGCG GGGCCCGGGG GGGGAGGAAA AAGAATAAAG 10080

GGGGTAGTGT CGGAGAGGCC CGCCGCGCAT TTAAGGAGTC GCCGCCCCGA CTCTGTGTCT 10140

TCGGGTGACT TGGTGCGCCG CCGTCAGCTA GTCTCCGATC TGCCCCGACC GACGGCTCCT 10200 

GCCACCCGAA CATGGCTCGC GGGGCCGGGT TGGTGTTTTT TGTTGGAGTT TGGGTCGTAT 10260 

CGTGCCTGGC GGCAGCACCC AGAACGTCCT GGAAACGGGT AACCTCGGGC GAGGACGTGG 10320

TGTTGCTTCC GGCGCCCGCG GGGCCGGAGG AACGCACCCG GGCCCACAAA CTACTGTGGG 10380 

CCGCGGAACC CCTGGATGCC TGCGGTCCCC TGCGCCCGTC GTGGGTGGCG CTGTGGCCCC 10440 

CCCGACGGGT GCTCGAGACG GTCGTGGATG CGGCGTGCAT GCGCGCCCCG GAACCGCTCG 10500 

CCATAGCATA CAGTCCCCCG TTCCCCGCGG GCGACGAGGG ACTGTATTCG GAGTTGGCGT 10560 

GGCGCGATCG CGTAGCCGTG GTCAACGAGA GTCTGGTCAT CTACGGGGCC CTGGAGACGG 10620

ACAGCGGTCT GTACACCCTG TCCGTGGTCG GCCTAAGCGA CGAGGCGCGC CAAGTGGCGT 10680 

CGGTGGTTCT GGTCGTGGAG CCCGCCCCTG TGCCGACCCC GACCCCCGAC GACTACGACG 10740 

AAGAAGACGA CGCGGGCGTG AGCGAACGCA CGCCGGTCAG CGTTCCCCCC CCAACCCCCC 10800 

CCCGTGGTCC CCCCGTGGCC CCCCCGACGC ACCCTCGTGT TATCCCCGAG GTGTCCCACG 10860 

40 TGCGCGGGGT AACGGTCCAT ATGGAGACCC CGGAGGCCAT TCTGTTTGCC CCCGGGGAGA 10920 

CGTTTGGGAC GAACGTCTCC ATCCACGCCA TTGCCCACGA CGACGGTCCG TACGCCATGG 10980 

ACGTCGTCTG GATGCGGTTT GACGTGCCGT CCTCGTGCGC CGAGATGCGG ATCTACGAAG 11040 

CTTGTCTGTA TCACCCGCAG CTTCCAGAGT GTCTATCTCC GGCCGACGCG CCGTGCGCCG 11100

TAAGTTCCTG GGCGTACCGC CTGGCGGTCC GCAGCTACGC CGGCTGTTCC AGGACTACGC 11160

CCCCGCCGCG ATGTTTTGCC GAGGCTCGCA TGGAACCGGT CCCGGGGTTG GCGTGGCTGG 11220

CCTCCACCGT CAATCTGGAA TTCCAGCACG CCTCCCCCCA GCACGCCGGC CTCTACCTGT 11280

GCGTGGTGTA CGTGGACGAT CATATCCACG CCTGGGGCCA CATGACCATC AGCACCGCGG 11340

CGCAGTACCG GAACGCGGTG GTGGAACAGC ACCTCCCCCA GCGCCAGCCC GAGCCCGTCG 11400

AGCCCACCCG CCCGCACGTG AGAGCCCCCC CTCCCGCGCC CTCCGCGCGC GGCCCGCTGC 11460

GCCTCGGGGC GGTGCTGGGG GCGGCCCTGT TGCTGGCCGC CCTCGGGCTG TCCGCGTGGG 11520

CGTGCATGAC CTGCTGGCGC AGGCGCTCCT GGCGGGCGGT TAAAAGCCGG GCCTCGGCGA 11580

CGGGCCCCAC TTACATTCGC GTGGCGGACA GCGAGCTGTA CGCGGACTGG AGTTCGGACA 11640

GCGAGGGGGA GCGCGACGGG TCCCTGTGGC AGGACCCTCC GGAGAGACCC GACTCTCCCT 11700

CCACAAATGG ATCCGGCTTT GAGATCTTAT CACCAACGGC TCCGTCTGTA TACCCCCATA 11760

GCGAGGGGCG TAAATCTCGC CGCCCGCTCA CCACCTTTGG TTCGGGAAGC CCGGGCCGTC 11820

GTCACTCCCA GGCCTCCTAT TCGTCCGTCC TCTGGTAAGG CGTCTTCCGA CGACGCGGAC 11880

GTCGGCGATG AACTGATTGC CATCGCGGAC GCACGCGGGG ACCCGCCAGA GACCCTGCCC 11940

CCCGGCGCGG GCGGCGCCGC GCCCGCGTGC CGCAGACCAC CTCGCGGCGG CTCCCCCGCG 12000

GCCTTTCCCG TGGCCCTCCA CGCCGTGGAC GCCCCCTCCC AATTCGTCAC CTGGCTCGCC 12060

GTGCGCTGGC TGCGGGGGGC GGTGGGTCTC GGGGCCGTCC TGTGCGGGAT TGCGTTTTAC 12120

GTGACGTCAA TCGCCCGAGG CGCATAAAGG TCCGGCGGCC AGCCCCGCCG CAGCTCATAA 12180

AAATCGTGAG TCACGGCAAC CGCACCTTCG CCTCCGGCCC TCCGCCAGCG CCCTTCCGCG 12240

TCCGCGATGA CCTCCCGGCC CGCCGACCAA GACTCGGTGC GTTCCAGCGC GTCGGTGCCG 12300

CTTTACCCCG CGGCCTCGCC CGTCCCGGCA GAAGCCTACT ACTCGGAAAG CGAAGACGAG 12360

GCCGCCAACG ACTTCCTCGT GCGCATGGGC CGCCAGCAGT CGGTCCTAAG GCGCCGACGG 12420

CGGCGCACGC GGTGCGTCGG GCTGGTTATC GCCTGTCTCG TCGTGGCCCT CCTATCTGGA 12480

GGGTTCGGGG CACTTTTGGT GTGGCTGCTC CGCTAAATGA CGCCTCGATG TATGGCGCCT 12540

TCTTCGCCCC CACCCCTCGC CGCGACCCAC GTCCGTATGT TAATTGCAAT AAAGTGGTTG 12600

ATTGTCATTA CGGTCTACTA GGTTGTCTTT TTTTTTTGGG GGGGGGGGGA AGGAAATGCA 12660

GAAAAGGGTA AGAAATTCTC GGAATTTCAC CCCCCGGGGG GGGCAAGTGC AGTACCCCAG 12720

TTCCTCAGTG TTTGGGAAAT CTATTGAACT CTCCCGGCTC CTCCGTGTTA GGGAAGTCTC 12780

TTGGGGAAAT CTATTGACCT CTCGCCCCCC CCCCCAGGAG GGGGCAGTGC AGTACCCCAG 12840

TTCCTCCGTG CTGGGGAAAT CTCTCTGCCG GGTACGGGCT CCAGACGAAG GACCCATACA 12900

TTTCCCCATC CGCACCCCAC ATCTGGCGTT CTAGAGTCAC GACGCATTTG CCCCCGTCCC 12960

CGCAGCAACA CACAAAGCGA TTTCAATTTT CACGATTTTA TTATTAATTA CACCAACCAC 13020

CCTGTCCCCG GGACGTGGTC AGGACCGGGG GTCCGCACCC AAACGCACGA AACAAATGCT 13080

GGCAGTGTGC CGAATATAAC CCCGCGTAGG AACACGTCGA CGCGTGCGCC AAACAGCACC 13140

AGAAGGCGCA TGCCATCAGC AGGTCGTGCA TATGGCGATG TGTTTGGACG CAGGGCGCAG 13200

CCGCGGCGAT AAAATTCATG GCGGCCGTCC GCCAGGGCCA CAGCGGCGAG GACTCCCTGT 13260

TGGCCCGAAG CCATTGGGTA TGAACCAGCT GCGCCTCCTG TCCGACCCTG GCTCCCGCCA 13320

GCGGGGGCGG TGGGTCGTGG GTGTTGAGAG CACACAGGCG GGACACCTCG ATCACCGTCC 13380

GAAAAAAGGC CCGGTGGTCC GCGGGCAGCA TCTGCAGGTG CGCCAGGGCC TGGGCGTTGA 13440

GAGGGTACAA CTCGGAGCCG GGGGACTCCG GGGGCCGGTC CGCGCGGTGC CGCGAGTGGG 13500

CACGCTTTGG GGCCCGGGTG TCGGACGCGG GCGCGTTACG GATCCCGACG CGGGGCAGAA 13560

CGTACGTGCG TTGGCGCGGC GATGAGGGGT CCGGGCTGCC GAGGGGGGCG TAGGGGACCG 13620

GGCTAGGCAA GCCCGCGGGT TGCGCGGGGT TCCCGTGGGG GTCTAGGCTC CCTGGGCACC 13680

CGTGGGGGTC GTGGGGGTCG CGGGTCCCTG
GATCGTGGAA CTCGCGGTTC CCTGGGCTCT 
GGTGCCCTGG GAATTCTTGA TGGTCGGACG 
GCACAGACTC GTAGTAGACC CGAATCTCCA 
5 CCCCGGTGCG GGGGCCCGTC GGTCGGAAGC 
GGCTGCATGC CGTCGGATGG GGTGCCTTTT 
GGGGTTTGGG GGTGGGCCGG GGAAACCCCG 
CCGGCGCGCT GGTTGGGTGG GGGTAGAGGG 
GGCCAGCACG CGATCCTGCC GCTCGTTCGA 

10 ATTTCGACTC GCGGGATCCG ATCGCACGTC 
GGTCACCGCA GTTCTGGCCG CCTCTCGGTC 
GACATCGCCA TACGTCCGGT GTGTGCACCG 
CAGGGCCCAA GACATGGTGT CCCGTCCACG 
GATGTTGGGA TCGGGGCCCC CCCGTCCCGT 

15 CCGTCCCGTC CCCCCGTCCC GTCCCCCCGT 
TCCCCCCGTC CCGTCCCCCC GTCCCGTCCC 
TCCCGTCCCC CCGTCCCGTC CCCCCGTCCC 
CCCCCGGGTC ACCGTACCTG CGATAAGGCT 
ACAGGGTGGG GGGGGGGGGG AGGGAAAGGC 

20 TCTGTATCCG ATCCGATCCG GGTGCGTCGG 
GCTGTGGCCC CCTTCGCGAT GCCGCCGCTG 
GCCCCTGGTG CGGCGGCGAC CGGGACGCCG 
CCCCGTCCGG GCCCGCCTCG GGGCGGGGCC 
CCAGTGCTCG CACTTTGCCC TAATAATATA 

25 GCGTTCTCAC TTCTTTTACC CTGCGGCCCC 
CGCGGGCCCC GGGCAGGGCG CCAGTGCTCG 
GGACGAAGTG CGAACGCTTC GCGTTCTCAC 
GGGCGGAGCC GCCCGCGGAC CAACGGGGCG 
GCCAACGGGA GCGCGGGGCC GGCATCTCAT 

30 GCCCGCCCGC GACGAGGGTC TCATTAGCAT 
GGCGCTAATG AGATGCCGCG CGGGCGGAGC 
ACGGACGCGG ACGCGCGGGC GTCGGGGCGG 
CGGAACCCCG GCGAGCCGGG GCGCGGCGGC 
TTTCCCCCCG CCCCGCGCGC CCCGAGGACT 

35 CACGGAGCGC GGCTACCGAC GCGGCCGCCA 
AAGACACAGG CACACGCACG CACCGCACGG 
CCCCCCACTG CCGCCCCTGA AGAAGAAGAA 
GTCGGCGGAG CAGCGGAAGA AGAAGAAGAC 
GGTCGCGATG GCGGACGAGG ACGGGGGACG 

40 CCCCGGATCT CCGGATCCAG CCGACGGACC 
CGCCGCGCGG CCCGGGTTCG GGTGGCACGG 
CGACGCCGCC GCCGATGCCG ATGCCGACGA 
CGAGCCTGCC GCGGACGGCG TCGTCTCGCC 



GGTATGCGCG 


GGACCCTGGG 


TTCTCTGGGA 


13740 


CGGGGAACCC 


GGGGCTCCCT 


GGGGACACGT 


13800 


GCTTCAGATG 


GCTTCGGGAT 


CGAGAGGGCC 


13860 


CGTTTCCCCG 


CCGCCGGATC 


ATGGTCGCCG 


13920 


GAGTGCCCTT 


CAAGCGTGTC 


CGCTCCTCTG 


13980 


AAGGAAAGGT 


CTCGGCTGCC 


CGCCCCAACC 


14040 


GATGCCATGG 


GGGGGGTCAC 


ACCCTAAGCG 


14100 


GAGTCCCCGG 


TCGACGAGAT 


CGTATCAAGG 


14160 


TCTAGCACAC 


CCACGGGTCT 


GCTGTGTGGG 


14220 


CGGAGGACAC 


AGCAGCGGGA 


GCTCCGGGTC 


14280 


CTCCCGTTCC 


CTTTTATGGA 


TCTCCGCGCA 


14340 


CGAAGAATCC 


AAAAACATGT 


CCGTCGTTTT 


14400 


AAGGCGGCGC 


CCGGCCTGCG 


AGAAAGCGCG 


14460 


CCCCCCGTCC 


CGTCCCCCCG 


TCCCGTCCCC 


14520 


CCCGTCCCCC 


CGTCCCGTCC 


CCCCGTCCCG 


14580 


CCCGTCCCGT 


CCCCCCGTCC 


CGTCCCCCCG 


14640 


GTCCCCCCGT 


CCCGTCCCCC 


CGCCCCGGCG 


14700 


GCAGTGGGTG 


GATGGGTCCT 


CGCGGTACGT 


14760 


AGAACGAAAA 


GGAACCGATG 


CGCCCGCGTC 


14820 


TGCCCCGCTC 


GCCGCCGGCG 


TCTCTGTCTC 


14880 


CCGTCCCGGT 


CTCCGCCGCG 


CAGCCGGTGT 


14940 


GCCCTTTATG 


TGCGCGAGGA 


ACGGCCCGCC 


15000 


CGCGGGATGA 


CGCGGGCCCC 


GGGCAGGGCG 


15060 


TATACTATTA 


GGACGAAGTG 


CGAACGCTTC 


15120 


GCCCCCTTTG 


GGGCGGAGCC 


CGCGGGATGA 


15180 


CACTTTGCCC 


TAATAATATA 


TATACTATTA 


15240 


TTCTTTTACC 


CTGCGGCCCC 


GCCCCCTTTG 


15300 


ACCTCGCCGG 


CCCCAAAGGG 


GCCGGCGGGG 


15360 


TACCACGAAC 


CCGGAAGGGC 


AGGGGAGCGA 


15420 


CGCGGGCGGA 


AGCGGAAGCC 


GCCCGCGCCG 


15480 


CGGCGGCGGC 


GCGACCAACG 


GGCCGCCGCC 


15540 


GGCCGCGCAT 


AATGCGGTTC 


CACCTGGGGG 


15600 


GTCGATCGCT 


CCTCCTCCGC 


GTCCTCCTCC 


15660 


ATATCAGCCA 


GGCGACGGGG 


CGATCGTCCA 


15720 


GGATCTACCC 


GATCGGCGCG 


GAGAGGCGAA 


15780 


GGGGGAGAGA 


GAGACCGCCA 


ACCCCCCCCC 


15840 


GACCCCCCGC 


ACACCCCGGT 


CGGAGGCGAT 


15900 


GACGACGACG 


ACGCAGGGCC 


GCGGGGCCGA 


15960 


TCTCCGGGCC 


GCGGCGGAGA 


CGACCGGCGG 


16020 


GCCGCCCACC 


CCGAACCCGG 


ACCGTCGCCC 


16080 


TGGGCCGGAG 


GAGAACGAAG 


ACGAGGACGA 


16140 


GGCGGCCCCG 


GCGTCCGGGG 


AGGCCGTCGA 


16200 


GCGGCAGCTG 


GCCCTGCTGG 


CCTCGATGGT 


16260 






GGACGAGGCC GTTCGCACGA TCCCGTCGCC CCCCCCGGAG CGCGACGGCG CGGAAGAAGA 16320 

AGCAGCCCGC TCGCCTTCTC CGCCGCGGAC CCCCTCCATG TGCGCCGATT ATGGCGAGGA 16380 

GAACGACGAC GACGACGATG ACGAGGACCG CGACGCGGGC CGCTGGGTCC GCGGACCGGA 16440 

GAACGACGTC CGCGGTCCGC GGGGCGTACC CGGACCCCAT GGCCAGCCTG TCGCCGCGAC 16500 

CCCCGGCGCC CCGCCGACAC CACCACCACC ACCACCGCCG CCGCCGCCGG CGCGCCCCCC 16560

GCCGGCGCTC GACCGCCTCT GACTCATCAA AATCCGGATC CTCGTCGTCG GCGTCCTCCG 16620 

CCTCCTCCTC CGCCTCCTCC TCCTCGTCTG CATCCGCCTC CTCGTCTGAC GACGACGACG 16680 

ACGACGCCGC CCGCGCCCCC GCCAGCGCCG CAGACCACGC CGCGGGCGGG ACCCTCGGCG 16740 

CGGACGACGA GGAGGCGGGG GTGCCCGCGA GGGCCCCGGG GGCGGCGCCC CGGCCGAGCC 16800 

CGCCCAGGGC CG 16813

(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:

Met Ala Asp Ile Pro Pro Asp Pro Pro Ala Leu Asn Thr Thr Pro Ala
1 5 10 15

Asn His Ala Pro Pro Ser Pro Pro Pro Gly Ser Arg Lys Arg Arg Arg
20 25 30

Pro Val Leu Pro Ser Ser Ser Glu Ser Glu Gly Lys Pro Asp Thr Glu
35 40 45

Ser Glu Ser Ser Ser Thr Glu Ser Ser Glu Asp Glu Ala Gly Asp Leu
50 55 60

Arg Gly Gly Arg Arg Arg Ser Pro Arg Glu Leu Gly Gly Arg Tyr Phe
65 70 75 80

Leu Asp Leu Ser Ala Glu Ser Thr Thr Gly Thr Glu Ser Glu Gly Thr
85 90 95

Gly Pro Ser Asp Asp Asp Asp Asp Asp Ala Ser Asp Gly Trp Leu Val
100 105 110

Asp Thr Pro Pro Arg Lys Ser Lys Arg Pro Arg Ile Asn Leu Arg Leu
115 120 125

Thr Ser Ser Pro Asp Arg Arg Ala Gly Val Val Phe Pro Glu Val Trp
130 135 140

Arg Asn Asp Arg Pro Ile Arg Ala Ala Gln Pro Gln Ala Pro Ala Gln
145 150 155 160

Ser Ser Gly Asp Arg Ala Ala Ala Pro Arg Arg Ser Ala Arg Gln Ala
165 170 175

Gln Met Arg Ser Gly Ala Ala Trp Thr Leu Asp Leu His Tyr Ile Arg
180 185 190

Gln Cys Val Asn Gln Leu Phe Arg Ile Leu Arg Ala Ala Pro Asn Pro
195 200 205

Pro Gly Ser Ala Asn Arg Leu Arg His Leu Val Arg Asp Cys Tyr Leu
210 215 220

Met Gly Tyr Cys Arg Thr Arg Leu Gly Pro Arg Thr Trp Gly Arg Leu
225 230 235 240

Leu Gln Ile Ser Gly Gly Thr Trp Asp Val Arg Leu Arg Asn Ala Ile
245 250 255

Arg Glu Val Glu Ala Arg Phe Glu Pro Ala Ala Glu Pro Val Cys Glu
260 265 270

Leu Pro Cys Leu Asn Ala Arg Arg Tyr Gly Pro Glu Cys Asp Val Gly
275 280 285

Asn Leu Glu Thr Asn Gly Gly Ser Thr Ser Asp Asp Glu Ile Ser Asp
290 295 300

Ala Thr Asp Ser Asp Asp Thr Leu Ala Ser His Ser Asp Thr Glu Gly
305 310 315 320

Gly Pro Ser Pro Ala Gly Arg Glu Asn Pro Glu Ser Ala Ser Gly Gly
325 330 335

Ala Ile Ala Ala Arg Leu Glu Cys Glu Phe Gly Thr Phe Asp Trp Thr
340 345 350

Ser Glu Glu Gly Ser Gln Pro Trp Leu Ser Ala Val Val Ala Asp Thr
355 360 365

Ser Ser Ala Glu Arg Ser Gly Leu Pro Ala Pro Gly Ala Cys Arg Ala
370 375 380

Thr Glu Ala Pro Glu Arg Glu Asp Gly Cys Arg Lys Met Arg Phe Pro
385 390 395 400

Ala Ala Cys Pro Tyr Pro Cys Gly His Thr Phe Leu Arg Pro
405 410



(2) INFORMATION FOR SEQ ID NO: 206:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:

Met Ala Asp Ile Pro Pro Asp Pro Pro Ala Leu Asn Thr Thr Pro Ala
1 5 10 15

Asn His Ala Pro Pro Ser Pro Pro Pro Gly Ser Arg Lys Arg Arg Arg
20 25 30

Pro Val Leu Pro Ser Ser Ser Glu Ser Glu Gly Lys Pro Asp Thr Glu
35 40 45

Ser Glu Ser Ser Ser Thr Glu Ser Ser Glu Asp Glu Ala Gly Asp Leu
50 55 60

Arg Gly Gly Arg Arg Arg Ser Pro Arg Glu Leu Gly Gly Arg Tyr Phe
65 70 75 80

Leu Asp Leu Ser Ala Glu Ser Thr Thr Gly Thr Glu Ser Glu Gly Thr
85 90 95

Gly Pro Ser Asp Asp Asp Asp Asp Asp Ala Ser Asp Gly Trp Leu Val
100 105 110

Asp Thr Pro Pro Arg Lys Ser Lys Arg Pro Arg Ile Asn Leu Arg Leu
115 120 125

Thr Ser Ser Pro Asp Arg Arg Ala Gly Val Val Phe Pro Glu Val Trp
130 135 140

Arg Asn Asp Arg Pro Ile Arg Ala Ala Gln Pro Gln Ala Pro Ala Gln
145 150 155 160

Ser Ser Gly Asp Arg Ala Ala Ala Pro Arg Arg Ser Ala Arg Gln Ala
165 170 175

Gln Met Arg Ser Gly Ala Ala Trp Thr Leu Asp Leu His Tyr Ile Arg
180 185 190

Gln Cys Val Asn Gln Leu Phe Arg Ile Leu Arg Ala Ala Pro Asn Pro
195 200 205

Pro Gly Ser Ala Asn Arg Leu Arg His Leu Val Arg Asp Cys Tyr Leu
210 215 220

Met Gly Tyr Cys Arg Thr Arg Leu Gly Pro Arg Thr Trp Gly Arg Leu
225 230 235 240

Leu Gln Ile Ser Gly Gly Thr Trp Asp Val Arg Leu Arg Asn Ala Ile
245 250 255

Arg Glu Val Glu Ala Arg Phe Glu Pro Ala Ala Glu Pro Val Cys Glu
260 265 270

Leu Pro Cys Leu Asn Ala Arg Arg Tyr Gly Pro Glu Cys Asp Val Gly
275 280 285

Asn Leu Glu Thr Asn Gly Gly Ser Thr Ser Asp Asp Glu Ile Ser Asp
290 295 300

Ala Thr Asp Ser Asp Asp Thr Leu Ala Ser His Ser Asp Thr Glu Gly
305 310 315 320

Gly Pro Ser Pro Ala Gly Arg Glu Asn Pro Glu Ser Ala Ser Gly Gly
325 330 335

Ala Ile Ala Ala Arg Leu Glu Cys Glu Phe Gly Thr Phe Asp Trp Thr
340 345 350

Ser Glu Glu Gly Ser Gln Pro Trp Leu Ser Ala Val Val Ala Asp Thr
355 360 365

Ser Ser Ala Glu Arg Ser Gly Leu Pro Ala Pro Gly Ala Cys Arg Ala
370 375 380

Thr Glu Ala Pro Glu Arg Glu Asp Gly Cys Arg Lys Met Arg Phe Pro
385 390 395 400

Ala Ala Cys Pro Tyr Pro Cys Gly His Thr Phe Leu Arg Pro
405 410

(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 287 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

Met Gly Val Val Val Val Ser Val Val Thr Leu Leu Asp Gln Arg Asn
1 5 10 15

Ala Leu Pro Arg Thr Ser Ala Asp Asp Ala Leu Trp Ser Phe Leu Leu
20 25 30

Arg Gln Cys Arg Ile Leu Ala Ser Glu Pro Leu Gly Thr Pro Val Val
35 40 45

Val Arg Pro Ala Asn Leu Arg Arg Leu Ala Glu Pro Leu Met Asp Leu
50 55 60

Pro Lys Phe Trp Ile Val Arg Thr Arg Ser Cys Arg Cys Pro Pro Asn
65 70 75 80

Thr Thr Thr Gly Leu Phe Ala Glu Asp Asp Pro Leu Glu Ser Ile Glu
85 90 95

Ile Leu Asp Ala Pro Ala Cys Phe Arg Leu Leu His Gln Glu Arg Pro
100 105 110

Gly Pro His Arg Leu Tyr His Leu Trp Val Val Gly Ala Ala Asp Leu
115 120 125

Cys Val Pro Phe Leu Glu Tyr Ala Gln Lys Thr Arg Leu Gly Phe Arg
130 135 140

Phe Ile Ala Met Lys Thr Asn Asp Ala Trp Val Gly Glu Pro Trp Pro
145 150 155 160

Leu Pro Asp Arg Phe Leu Pro Glu Arg Thr Val Ser Trp Thr Pro Phe
165 170 175

Pro Ala Ala Pro Asn His Pro Leu Glu Asn Leu Leu Ser Arg Tyr Glu
180 185 190

Tyr Gln Tyr Gly Val Val Val Pro Gly Asp Arg Glu Arg Ser Cys Leu
195 200 205

Arg Trp Leu Arg Ser Leu Val Ala Pro His Asn Lys Pro Arg Pro Ala
210 215 220

Ser Ser Arg Pro His Pro Ala Thr His Pro Thr Gln Arg Pro Cys Phe
225 230 235 240

Thr Cys Met Gly Arg Pro Glu Ile Pro Asp Glu Pro Ser Trp Gln Thr
245 250 255

Gly Asp Asp Asp Pro Gln Asn Pro Gly Pro Pro Leu Ala Val Gly Asp
260 265 270

Glu Trp Pro Pro Ser Ser His Val Cys Tyr Pro Ile Thr Asn Leu
275 280 285



(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

Met Ala Cys Arg Lys Phe Cys Gly Val Tyr Arg Arg Pro Asp Lys Arg
1 5 10 15

Gln Glu Ala Ser Val Pro Pro Glu Thr Asn Thr Ala Pro Ala Phe Pro
20 25 30

Ala Ser Thr Phe Tyr Thr Pro Ala Glu Asp Ala Tyr Leu Ala Pro Gly
35 40 45

Pro Pro Glu Thr Ile His Pro Ser Arg Pro Pro Ser Pro Gly Glu Ala
50 55 60

Ala Arg Leu Cys Gln Leu Gln Glu Ile Leu Ala Gln Met His Ser Asp
65 70 75 80

Glu Asp Tyr Pro Ile Val Asp Ala Ala Gly Ala Glu Glu Glu Asp Glu
85 90 95

Ala Asp Asp Asp Ala Pro Asp Asp Val Ala Tyr Pro Glu Asp Tyr Ala
100 105 110

Glu Gly Arg Phe Leu Ser Met Val Ser Ala Ala Pro Leu Pro Gly Ala
115 120 125

Ser Gly His Pro Pro Val Pro Gly Arg Ala Ala Pro Pro Asp Val Arg
130 135 140

Thr Cys Asp Ser Gly Lys Val Gly Ala Thr Gly Phe Thr Pro Glu Glu
145 150 155 160

Leu Asp Thr Met Asp Arg Glu Ala Leu Arg Ala Ile Ser Arg Gly Cys
165 170 175

Lys Pro Pro Ser Thr Leu Ala Lys Leu Val Thr Gly Leu Gly Phe Ala
180 185 190

Ile His Gly Ala Leu Ile Pro Gly Ser Glu Gly Cys Val Phe Asp Ser
195 200 205

Ser His Pro Asn Tyr Pro His Arg Val Ile Val Lys Ala Gly Trp Tyr
210 215 220

Ala Ser Thr Asn His Glu Ala Arg Leu Leu Arg Arg Leu Asn His Pro
225 230 235 240

Ala Ile Leu Pro Leu Leu Asp Leu His Val Val Ser Gly Val Thr Cys
245 250 255

Leu Val Leu Pro Lys Tyr His Cys Asp Leu Tyr Thr Tyr Leu Ser Lys
260 265 270

Arg Pro Ser Pro Leu Gly His Leu Gln Ile Thr Ala Val Ser Arg Gln
275 280 285

Leu Leu Ser Ala Ile Asp Tyr Val His Cys Glu Gly Ile Ile His Arg
290 295 300

Asp Ile Lys Thr Glu Asn Ile Leu Ile Asn Thr Pro Glu Asn Ile Cys
305 310 315 320

Leu Gly Asp Phe Gly Ala Ala Cys Phe Val Arg Gly Cys Arg Ser Ser
325 330 335

Pro Phe His Tyr Gly Ile Ala Gly Thr Ile Asp Thr Asn Ala Pro Glu
340 345 350

Val Leu Ala Gly Asp Pro Tyr Thr Gln Val Ile Asp Ile Trp Ser Ala
355 360 365

Gly Leu Val Ile Phe Glu Thr Ala Val His Thr Ala Ser Leu Phe Ser
370 375 380

Ala Pro Arg Asp Pro Glu Arg Arg Pro Cys Asp Asn Gln Ile Ala Arg
385 390 395 400

Ile Ile Arg Gln Ala Gln Val His Val Asp Glu Phe Pro Thr His Ala
405 410 415

Glu Ser Arg Leu Thr Ala His Tyr Arg Ser Arg Ala Ala Gly Asn Asn
420 425 430

Arg Pro Ala Trp Trp Ala Trp Thr Arg Tyr Tyr Lys Ile His Thr Asp
435 440 445

Val Glu Tyr Leu Ile Cys Lys Ala Leu Thr Phe Asp Ala Ala Leu Arg
450 455 460

Pro Ser Ala Ala Glu Leu Leu Arg Leu Pro Leu Phe His Pro Lys
465 470 475

(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

Val Gly Gly Leu Cys Leu Met Ile Leu Gly Met Ala Cys Leu Leu Glu
1 5 10 15

Val Leu Arg Arg Leu Gly Arg Glu Leu Ala Arg Cys Cys Pro His Ala
20 25 30

Gly Gln Phe Ala Pro

(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 385 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:

Met Gly Arg Leu Thr Ser Gly Val Gly Thr Ala Ala Leu Leu Val Val
1 5 10 15

Ala Val Gly Leu Arg Val Val Cys Ala Lys Tyr Ala Asp Pro Ser Leu
20 25 30

Lys Met Ala Asp Pro Asn Arg Phe Arg Gly Lys Asn Leu Pro Val Leu
35 40 45

Asp Gln Leu Thr Asp Pro Pro Gly Val Lys Arg Val Tyr His Ile Gln
50 55 60

Pro Ser Leu Glu Asp Pro Phe Gln Pro Pro Ser Ile Pro Ile Thr Val
65 70 75 80

Tyr Tyr Ala Val Leu Glu Arg Ala Cys Arg Ser Val Leu Leu His Ala
85 90 95

Pro Ser Glu Ala Pro Gln Ile Val Arg Gly Ala Ser Asp Glu Ala Arg
100 105 110

Lys His Thr Tyr Asn Leu Thr Ile Ala Trp Tyr Arg Met Gly Asp Asn
115 120 125

Cys Ala Ile Pro Ile Thr Val Met Glu Tyr Thr Glu Cys Pro Tyr Asn
130 135 140

Lys Ser Leu Gly Val Cys Pro Ile Arg Thr Gln Pro Arg Trp Ser Tyr
145 150 155 160

Tyr Asp Ser Phe Ser Ala Val Ser Glu Asp Asn Leu Gly Phe Leu Met
165 170 175

His Ala Pro Ala Phe Glu Thr Ala Gly Thr Tyr Leu Arg Leu Val Lys
180 185 190

Ile Asn Asp Trp Thr Glu Ile Thr Gln Phe Ile His Arg Ala Arg Ala
195 200 205

Ser Cys Lys Tyr Ala Leu Pro Leu Arg Ile Pro Pro Ala Ala Cys Leu
210 215 220

Thr Ser Lys Ala Tyr Gln Gln Gly Val Thr Val Asp Ser Ile Gly Met
225 230 235 240

Leu Pro Arg Phe Ile Pro Glu Asn Gln Arg Thr Val Ala Lys Leu Lys
245 250 255

Ile Ala Gly Trp His Gly Pro Lys Pro Pro Tyr Thr Ser Thr Leu Leu
260 265 270

Pro Pro Glu Leu Ser Asp Thr Thr Asn Ala Thr Gln Pro Glu Leu Val
275 280 285

Pro Glu Asp Pro Glu Asp Ser Ala Leu Leu Glu Asp Pro Ala Gly Thr
290 295 300

Val Ser Ser Gln Ile Pro Pro Asn Trp His Ile Pro Ser Ile Gln Asp
305 310 315 320

Val Ala Pro His His Ala Pro Ala Ala Pro Ser Asn Pro Gly Leu Ile
325 330 335

Ile Gly Ala Gly Ser Thr Leu Ala Val Leu Val Ile Gly Gly Ile Ala
340 345 350

Phe Trp Val Arg Arg Arg Ala Gln Met Ala Pro Lys Arg Leu Arg Leu
355 360 365

Pro His Ile Arg Asp Asp Asp Ala Pro Pro Ser His Gln Pro Leu Phe
370 375 380

Tyr
385

(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 368 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:

Met Pro Gly Arg Ser Leu Gln Gly Leu Ala Ile Leu Gly Leu Trp Val
1 5 10 15

Cys Ala Thr Gly Leu Val Val Arg Gly Pro Thr Val Ser Leu Val Ser
20 25 30

Asp Ser Leu Val Asp Ala Gly Ala Val Gly Pro Gln Gly Phe Val Glu
35 40 45

Glu Asp Leu Arg Val Phe Gly Glu Leu His Phe Val Gly Ala Gln Val
50 55 60

Pro His Thr Asn Tyr Tyr Asp Gly Ile Ile Glu Leu Phe His Tyr Pro
65 70 75 80

Leu Gly Asn His Cys Pro Arg Val Val His Val Val Thr Leu Thr Ala
85 90 95

Cys Pro Arg Arg Pro Ala Val Ala Phe Thr Leu Cys Arg Ser Thr His
100 105 110

His Ala His Ser Pro Ala Tyr Pro Thr Leu Glu Leu Gly Leu Ala Arg
115 120 125

Gln Pro Leu Leu Arg Val Arg Thr Ala Thr Arg Asp Tyr Ala Gly Val
130 135 140

Leu Arg Val Trp Val Gly Ser Ala Thr Asn Ala Ser Leu Phe Val Leu
145 150 155 160

Gly Val Ser Ala Asn Gly Thr Phe Val Tyr Asn Gly Ser Asp Tyr Gly
165 170 175

Ser Cys Asp Pro Ala Gln Leu Pro Phe Ser Ala Pro Arg Leu Gly Pro
180 185 190

Ser Ser Val Tyr Thr Pro Gly Ala Ser Arg Pro Thr Pro Pro Arg Thr
195 200 205

Thr Thr Ser Pro Ser Ser Pro Arg Asp Pro Thr Pro Ala Pro Gly Asp
210 215 220

Thr Gly Thr Pro Ala Pro Ala Ser Gly Glu Arg Ala Pro Pro Asn Ser
225 230 235 240

Thr Arg Ser Ala Ser Glu Ser Arg His Arg Leu Thr Val Ala Gln Val
245 250 255

Ile Gln Ile Ala Ile Pro Ala Ser Ile Ile Ala Phe Val Phe Leu Gly
260 265 270

Ser Cys Ile Cys Phe Ile His Arg Cys Gln Arg Arg Tyr Arg Arg Pro
275 280 285

Arg Gly Gln Ile Tyr Asn Pro Gly Gly Val Ser Cys Ala Val Asn Glu
290 295 300

Ala Ala Met Ala Arg Leu Gly Ala Glu Leu Arg Ser His Pro Asn Thr
305 310 315 320

Pro Pro Lys Pro Arg Arg Arg Ser Ser Ser Ser Thr Thr Met Pro Ser
325 330 335

Leu Thr Ser Ile Ala Glu Glu Ser Glu Pro Gly Pro Val Val Leu Leu
340 345 350

Ser Val Ser Pro Arg Pro Arg Ser Gly Pro Thr Ala Pro Gln Glu Val
355 360 365

(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 528 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

Met Arg Ala Gly Leu Val Phe Phe Val Gly Val Trp Val Val Ser Cys
1 5 10 15

Leu Ala Ala Ala Pro Arg Thr Ser Trp Lys Arg Val Thr Ser Gly Glu
20 25 30

Asp Val Val Leu Leu Pro Ala Pro Ala Gly Pro Glu Glu Arg Thr Arg
35 40 45

Ala His Lys Leu Leu Trp Ala Ala Glu Pro Leu Asp Ala Cys Gly Pro
50 55 60

Leu Arg Pro Ser Trp Val Trp Pro Pro Arg Arg Val Leu Glu Thr Val
65 70 75 80

Val Asp Ala Ala Cys Met Arg Ala Pro Glu Pro Leu Ala Ile Ala Tyr
85 90 95

Ser Pro Pro Phe Pro Ala Gly Asp Glu Gly Ser Glu Leu Ala Trp Arg
100 105 110

Asp Arg Val Ala Val Val Asn Glu Ser Leu Val Ile Tyr Gly Ala Leu
115 120 125

Glu Thr Asp Ser Gly Thr Leu Ser Val Val Gly Leu Ser Asp Glu Ala
130 135 140

Arg Gln Val Ala Ser Val Val Leu Val Val Glu Pro Ala Pro Val Pro
145 150 155 160

Thr Pro Thr Pro Asp Asp Tyr Asp Glu Glu Asp Asp Ala Gly Val Ser
165 170 175

Thr Pro Val Ser Val Pro Pro Pro Thr Pro Pro Arg Gly Pro Pro Val
180 185 190

Ala Pro Pro Thr His Pro Arg Val Ile Pro Glu Val Ser His Val Arg
195 200 205

Gly Val Thr Val His Met Pro Glu Ala Ile Leu Phe Ala Pro Gly Glu
210 215 220

Thr Phe Gly Thr Asn Val Ser Ile His Ala Ile Ala His Asp Asp Gly
225 230 235 240

Pro Tyr Ala Met Asp Val Val Trp Met Arg Phe Asp Val Pro Ser Ser
245 250 255

Cys Ala Glu Met Arg Ile Tyr Glu Ala Cys Leu Tyr His Pro Gln Leu
260 265 270

Pro Glu Cys Leu Ser Pro Ala Asp Ala Pro Cys Ala Val Ser Ser Trp
275 280 285

Ala Tyr Arg Leu Ala Val Arg Ser Tyr Ala Gly Cys Ser Arg Thr Thr
290 295 300

Pro Pro Pro Arg Cys Phe Ala Glu Ala Arg Met Glu Pro Val Pro Gly
305 310 315 320

Leu Ala Trp Leu Ala Ser Thr Val Asn Leu Glu Phe Gln His Asp Gln
325 330 335

His Ala Gly Leu Cys Val Val Tyr Val Asp Asp His Ile His Ala Trp
340 345 350

Gly His Met Thr Ile Ser Thr Ala Ala Gln Tyr Arg Asn Ala Val Val
355 360 365

Glu Gln His Leu Pro Gln Arg Gln Pro Glu Pro Val Glu Pro Trp His
370 375 380

Val Arg Ala Pro Pro Pro Ala Pro Ser Arg Pro Leu Arg Leu Gly Ala
385 390 395 400

Val Leu Gly Ala Ala Leu Leu Leu Ala Ala Leu Gly Leu Ser Ala Trp
405 410 415

Ala Cys Met Thr Cys Trp Arg Arg Arg Ser Trp Arg Ala Val Lys Ser
420 425 430

Arg Ala Ser Ala Thr Gly Pro Thr Tyr Ile Arg Val Ala Asp Ser Glu
435 440 445

Leu Tyr Ala Asp Trp Ser Ser Asp Ser Glu Gly Glu Arg Asp Gly Ser
450 455 460

Leu Trp Gln Asp Pro Pro Glu Arg Pro Asp Ser Pro Ser Thr Asn Gly
465 470 475 480

Ser Gly Phe Glu Ile Leu Ser Pro Thr Ala Pro Ser Val Tyr Pro His
485 490 495

Ser Glu Gly Arg Lys Ser Arg Arg Pro Leu Thr Thr Phe Gly Ser Gly
500 505 510

Ser Pro Gly Arg Arg His Ser Gln Ala Ser Tyr Ser Ser Val Leu Trp
515 520 525

(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

Val His Ala Val Asp Ala Pro Ser Gln Phe Val Thr Trp Leu Ala Val
1 5 10 15

Arg Trp Leu Arg Gly Ala Val Gly Leu Gly Ala Val Leu Cys Gly Ile
20 25 30

Ala Phe Tyr Val Thr Ser Ile Arg Ala
35 40

(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

Met Thr Ser Arg Pro Ala Asp Gln Asp Ser Val Arg Ser Ser Ala Ser
1 5 10 15

Val Pro Leu Tyr Pro Ala Asp Val Pro Ala Glu Ala Tyr Tyr Ser Glu
20 25 30

Ser Glu Asp Glu Ala Ala Asn Asp Phe Leu Val Arg Met Gly Arg Gln
35 40 45

Gln Ser Val Leu Arg Arg Arg Arg Arg Arg Thr Arg Cys Val Gly Leu
50 55 60

Val Ile Ala Cys Leu Val Val Leu Ser Gly Gly Phe Gly Ala Leu Leu
65 70 75 80

Val Trp Leu Leu Arg
85

(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

Met Ser Ala Glu Gln Arg Lys Lys Lys Lys Thr Thr Thr Thr Thr Gln
1 5 10 15

Gly Arg Gly Ala Glu Val Ala Met Ala Asp Glu Asp Gly Gly Arg Leu
20 25 30

Arg Ala Ala Ala Glu Thr Thr Gly Gly Pro Gly Ser Pro Asp Pro Ala
35 40 45

Asp Gly Pro Pro Pro Thr Pro Asn Pro Asp Arg Arg Pro Ala Ala Arg
50 55 60

Pro Gly Phe Gly Trp His Gly Gly Pro Glu Glu Asn Glu Asp Glu Asp
65 70 75 80

Asp Asp Ala Ala Ala Asp Ala Asp Ala Asp Glu Ala Ala Pro Ala Ser
85 90 95

Gly Glu Ala Val Asp Glu Pro Ala Ala Asp Gly Val Val Ser Pro Arg
100 105 110

Gln Leu Ala Leu Leu Ala Ser Met Val Asp Glu Ala Val Arg Thr Ile
115 120 125

Pro Ser Pro Pro Pro Glu Arg Asp Gly Ala Glu Glu Glu Ala Ala Arg
130 135 140

Ser Pro Ser Pro Pro Arg Thr Pro Ser Met Cys Ala Asp Tyr Gly Glu
145 150 155 160

Glu Asn Asp Asp Asp Asp Asp Asp Asp Asp Arg Asp Ala Gly Arg Trp
165 170 175

Val Arg Gly Pro Glu Asn Asp Val Arg Gly Pro Arg Gly Val Pro Gly
180 185 190

Pro His Gly Gln Pro Val Ala Ala Thr Pro Gly Ala Pro Pro Thr Pro
195 200 205

Pro Pro Pro Pro Pro Pro Pro Pro Pro Ala Arg Pro Pro Pro Ala Leu
210 215 220

Asp Arg Leu
225

(2) INFORMATION FOR SEQ ID NO: 216:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

Met Ser Ala Glu Gln Arg Lys Lys Lys Lys Thr Thr Thr Thr Thr Gln
1 5 10 15

Gly Arg Gly Ala Glu Val Ala Met Ala Asp Glu Asp Gly Gly Arg Leu
20 25 30

Arg Ala Ala Ala Glu Thr Thr Gly Gly Pro Gly Ser Pro Asp Pro Ala
35 40 45

Asp Gly Pro Pro Pro Thr Pro Asn Pro Asp Arg Arg Pro Ala Ala Arg
50 55 60

Pro Gly Phe Gly Trp His Gly Gly Pro Glu Glu Asn Glu Asp Glu Asp
65 70 75 80

Asp Asp Ala Ala Ala Asp Ala Asp Ala Asp Glu Ala Ala Pro Ala Ser
85 90 95

Gly Glu Ala Val Asp Glu Pro Ala Ala Asp Gly Val Val Ser Pro Arg
100 105 110

Gln Leu Ala Leu Leu Ala Ser Met Val Asp Glu Ala Val Arg Thr Ile
115 120 125

Pro Ser Pro Pro Pro Glu Arg Asp Gly Ala Glu Glu Glu Ala Ala Arg
130 135 140

Ser Pro Ser Pro Pro Arg Thr Pro Ser Met Cys Ala Asp Tyr Gly Glu
145 150 155 160

Glu Asn Asp Asp Asp Asp Asp Asp Asp Asp Arg Asp Ala Gly Arg Trp
165 170 175

Val Arg Gly Pro Glu Asn Asp Val Arg Gly Pro Arg Gly Val Pro Gly
180 185 190

Pro His Gly Gln Pro Val Ala Ala Thr Pro Gly Ala Pro Pro Thr Pro
195 200 205

Pro Pro Pro Pro Pro Pro Pro Pro Pro Ala Arg Pro Pro Pro Ala Leu
210 215 220

Asp Arg Leu
225

(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:

CGCCCGCGTT TCTGCCGCCC GCGCGCTCCT GTGTGGACCC CGGGGTGGGC GGCGGGGGGG 60

GTGCCGTGGG TGTGGCGGCG GGGCGCGGGC CGGGGCCGGG GCTCGCTGGT CTGCCGAAGT 120

AAAGAAAAGA TCGCCACCGT GTGTTCGTCT GTGTGTTCTG CGCGGCGCCG GGGCCCCCCT 180

GCCGGGCGGG GCGGTGGGGC GGGGCCGGGG TCGCGGCGGG GAAGGAAGGA AAGGCCCCGG 240

AAGCGCCGGG AGGGGGCGCC GGCGCGACGC GGGCGGCCGG GCGGGGGCGC GCGGCGGCCG 300

GGCGGGGGCG CGCGGCGGCC GGGCGGGGGC GCGCGGCGGC CGGGCGGGGG CGCGCTTTCC 360

CCGCGTCGCC CCTCGGGTTC CCAAGACCTA TCACGTGTGC GCAGGGGAGG GGAGGACGCG 420

GGGGAGGGGA GGACGCGGGG GAGGGGAGGA CGCGGGGGAG GGGAGGACGC GGGGGAGGGG 480

AGGACGCGGG GGATATATAA AGCGGTAGAA AGCGCGGGAA TGGGCATATT GGACCCGCGT 540

GATTCGGTTG CTCGCGGTTG TCTTGTTTGG ACGTTTTTTA TGCGGGAACA AGGGGGCTTA 600

CCGGTTACAC TGTCCGCTCG CTATGGGGTT CGTCTGTCTG TTTGGGCTTG TCGTTATGGG 660

AGCCTGGGGG GCGTGGGGTG GGTCACAGGC AACCGAATAT GTTCTTCGTA GTGTTATTGC 720

CAAAGAGGTG GGGGACATAC TAAGAGTGCC TTGCATGCGG ACCCCCGCGG ACGATGTTTC 780

TTGGCGCTAC GAGGCCCCGT CCGTTATTGA CTATGCCCGC ATAGACGGAA TATTTCTTCG 840

CTATCACTGC CCGGGGTTGG ACACGTTTTT GTGGGATAGG CACGCCCAGA GGGCGTATCT 900

GGTTAACCCC TTTCTCTTTG CGGCGGGATT TTTGGAGGAC TTGAGTCACT CTGTGTTTCC 960

GGCCGACACC CAGGAAACAA CGACGCGCCG GGCCCTTTAT AAAGAGATAC GCGATGCGTT 1020

GGGCAGTCGA AAACAGGCCG TCAGCCACGC ACCCGTCAGG GCCGGGTGTG TAAACTTTGA 1080

CTACTCACGC ACTCGCCGCT GCGTCGGGCG ACGCGATTTA CGGCCTGCCA ACACCACGTC 1140

AACGTGGGAA CCGCCTGTGT CGTCGGACGA TGAAGCGAGC TCGCAGTCGA AGCCCCTCGC 1200

CACCCAGCCG CCCGTCCTCG CCCTTTCGAA CGCCCCCCCA CGGCGGGTCT CCCCGACGCG 1260

AGGTCGGCGC CGGCATACTC GCCTCCGACG CAACTAGCCA CGTCTGCATC GCAAGCCACC 1320

CTGGGTCGGG AGCAGGATAT CCGACCCGTC TAGCGGCCGG GTCGGCTGTC CAGCGTCGTC 1380 

GCCCTAGAGG CTGTCCGCCG GGCGTGATGT TTTCCGCATC TACGACCCCC GAACAGCCCC 1440 

TGGGGCTGTC GGGCGATGCG ACGCCGCCCC TGCCGACTTC CGTGCCCCTG GACTGGGCCG 1500 

CGTTTCGGCG CGCGTTTCTG ATCGACGACG CCTGGCGGCC CCTGTTGGAG CCGGAGCTCG 1560

CGAACCCCCT AACCGCGCGC CTCCTCGCGG AGTATGACCG TCGGTGCCAG ACCGAAGAGG 1620 

TGCTGCCGCC GCGGGAGGAT GTGTTCTCCT GGACGCGGTA TTGTACCCCC GACGACGTGC 1680 

GCGTGGTTAT CATCGGGCAG GACCCGTACC ACCATCCCGG CCAGGCGCAC GGCCTGGCGT 1740 

TTAGCGTGCG TGCGGATGTG CCGGTGCCTC CGAGTCTACG GAACGTGCTG GCGGCGGTAA 1800 

AAAATTGTTA CCCCGACGCG CGCATGAGCG GCCGCGGCTG CCTGGAAAAG TGGGCTCGCG 1860

ACGGCGTGCT GTTGTTGAAC ACGACCCTGA CCGTCAAGCG CGGGGCGGCG GCGTCCCACT 1920 

COAAGCTTGG ATGGGACCGC TTTGTGGGCG GGGTGGTCCG ACGGCTGGCC GCGCGCCGCC 1980 

CGGGCCTGGT CTTTATGCTC TGGGGCGCCC ATGCCCAGAA CGCGATCAGG CCCGACCCTC 2040 

GCCAACACTA CGTCCTCAAG TTTTCTCACC CGTCGCCCCT CTCCAAGGTC CCGTTTGGGA 2100 

CGTGCCAGCA TTTCCTCGCC GCGAATCGCT ACCTCGAAAC CCGGGACATT ATGCCTATCG 2160

ACTGGTCGGT ATAAGATGCC GACATCCGGG GTCTTGATTT ACGAGGGGGC AATTAATAAA 2220 

GACTGTTGAT GGTTAAATCT CGGGTCTCAT ACCGGTCCGT GATGTCGGGC GTGGGGGAAG 2280 

AGAGGGTCCC CTCTGCGTTT ACTATCCTTG CCTCGTGGGG CTGGACGTTT GCACCCCAGA 2340 

ACCATGATCC TGGCGCGTCG CCGAATACGA CGCCCATAGA GTCGATTGCG GGGACCGCAC 2400 

CGGACGCGCA CGTGGGGCCT CTCGACGGAG AGCCGGACCG GGATGCGATC TCCCCGCTTA 2460

CGTCGAGCGT GGCCGGCGAC CCGCCGGGGG CGGACGGCCC CTACGTCACC TTTGATACTC 2520 

TGTTTATGGT ATCTTCGATC GACGAACTGG GGCGCCGCCA GCTCACGGAT ACGATCCGTA 2580 

AGGACCTGCG GCTGTCGCTG GCCAAGTTCA GCATCGCGTG TACCAAGACC TCGTCGTTTT 2640 

CGGGGACGGC CGCGCGCCAG CGCAAGCGCG GAGCACCGCC GCAACGCACA TGCGTACCAC 2700 

GCAGCAACAA GAGCCTCCAG ATGTTCGTTT TGTGCAAGCG CGCCAACGCC GCGCAGGTGC 2760

GCGAGCAGCT GCGGGCGGTT ATTCGGTCGC GCAAGCCGCG CAAGTATTAC ACGCGGTCCT 2820 

CGGATGGGCG GCTCTGCCCG GCCGTCCCCG TGTTTGTACA CGAGTTTGTT TCGTCCGAAC 2880 

CCATGCGCCT CCATCGAGAT AACGTCATGC TGTCTACGGA ACCAGACTAA GCACCCCCGC 2940 

CGTCCCCTTT CTTTTCCCCC TACCCTTCCC CCCGTTACTG ATGTGTTGTG ACGTTTCAAT 3000 

AAATAACACG TAGCTTATTT TGTTGGATGA TGGATTGATT GATTTTATTG ACCGTTCGTT 3060

CGCCCGGCGG TGCCGTCGCC GCGCGCAGAG GGAATATGCA AGCGGGCGGG GTGGGGAGGA 3120

AAGAAGGTTT CAGGTTCCGG GGGTTGGGTC TGCGTCGTCC AGGGTGGGGC TGATCTGAAT 3180 

TTCCCGCAGA ACCTCGACCA GTAGGTCTGT TGTGTTTGCT GGGAACTCGC CCGCCGTTGG 3240 

GGATACGGGG GCGGGGGGTG TGGTTGGGCG GACGTCCAGG GGTGCGTTAT CGCACCCCCG 3300 

CGCCGCCTCG GGGGCCGTCC CGTAGATCGT TGCGGTGATG TAGATGGTGT CCGGGGTCCA 3360

CACCACCGTC AGGATGCCGG CCGTCGCACT CCGGACGCTT TCGCCGTGCG ATGAGCTGAC 3420 

CCAGGAGTCA AAGGGGTACG CGTACATATG GGCGTCCCAC CAGCGCTCCA GCCTCTGGGT 3480 

ACTAGCGCGT CCTATAAAGC GGTATGCGCA AAATTCGGCA CGACAGTCGA TAATCACCAG 3540 

CAGCCCGATG GGGGTGTGTT GTATCACCAC GCCTCCGCGG GGCAGGCGGT CCTGGCGCGC 3600 

TCGACCCCGC GTCAGAACCG CGCGCGTCCC TGACTCAAAC ACGTGCACCA CCTGTGCCGC 3660

GTCCGGCAGC GCGCTCGTTA GCGACGCCCT GGGGTGATGT AGGCTGTACG CGATGGTCGT 3720 

CTGGGGGTTC CCCATGTCTC GGGGGGGTGG GGGTGAATGT CACCCGGCCC GGGTGCGGTG 3780 

GGAACGCGAG GGAATGGAGG GTTAATAGAC AATGACCACA TTCGGATCGC GTAGAGCAGA 3840 
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TAGTATGTGC TCGCTAATGA CGTCATCGCG TTCGTGGCGC TCCCGGAGCG GGTTTAGATT 3900 

CATGTGCAGG AACTCGGATG AGGTGGTGCG GGACATGGCT ACGTACGCGC TGTTTAGGCG 3960 

CAGGTTTCCG GGCGTGAAGC ATATGGCGAC CTTGTCCAGA CTGAGCCCCT GGGAGCGCGT 4020 

GATGGTCATC GCGAGTTTGG AGCTGATGCC GTAGTCGGCG TTGATGGCCA TGGCCAGCTC 4080 

CGTGGAGTCG ATCGACTCGA CAAACTCACT GATGTTGGTA TTGACGACAG ACATGAAGCC 4140

GTGCTGGTCC CGCAGGACGA TGTAGGGCAG GGGGGACTCC TCCAAGAACT CGGCCACGCC 4200 

GGCCGTCGCG TGCCGCCGCC GCAGCTCCTC CGCGAACGCG AACACCCGGG TGTACGTGTA 4260 

CCCCATCAGC GTGTAGTTGT CCGTCTGCAG GGCCACGGAC ATCAGCCCCC CGCGCGGCGA 4320 

GCCGGTCAGC AGCTCGCAGC CCCGGAAGAT GACATTGTCC ACGTAGGTGC TGAAGGGGGC 4380 

GCTCTCAAAC ACCTCCCCGA AGAGCTCCCG TAGGATAAGG TATCGCCCCA GAAAGGCCCT 4440

CTTCAGGAGC CCAAACTGGG CGTGGACGGC CGCGGTGGTC TCCGGCTCTT CGAGGGCGTA 4500 

GTGGCAGTAG AACACGTCCA GCTGCTGTTC GTCCAGCCCG GCGAAGATAA CGTCAAGGTC 4560 

GTCGTCGGGG AAGTCGTCCG GGCCCCCGTC CCGCGGGCCC AGGTGCTTAA AATTGAACGC 4620 

ACGCTCCCCC GGAGAGCGGT CGCTGGTGTC GGCGGCCCTG GTTGCCGATG CGCCGGCGGC 4680 

GTCCCGGCGT AGCGACAGGA GTTCTGCCGT CAGCTCCCCT AGGCGGCCGT AGGCCAGGGT 4740

CCTCTGGGTC GCGTCCAGGC CGGGGCGCTG GAGAAAGTTG TAAAAGTGAA TCAGCCCGCC 4800 

GAACATGAGC CGCGACAGGA ACCGGTAGGC GAACTCCACC GAGGTCTCCC CCTGGGTCTT 4860 

CACGAAGCTG TCGTCGCGCA GCACAGCCTC GAAGGTCCGA AACGTCCCGT CGAACCCAAA 4920 

CACCATCTTT CGGAGGCGCG CGGTCACCGC GACCTGGCTG TTGAGGACGT ACGTGATGTC 4980 

GTTCCGGGCC ACGACTAGCT GTTGCTTGCT GTGCACCTCA CAGCGCACGT GCCCCGCGTC 5040

CTGGTCCTGA CTCTGGGAGT AGTTGGTGAT GCGACTGGCG TTGGCCGTGA TCCACTTTTC 5100 

CATGGTCAGC GTGGGTTGCT GCGTGAGCCG TCGATACTCG TCAAACTCTT TGACCGACAC 5160 

AAACGTAAGC ACGGGGAGGG TAAACACAAC AAACTCCCCC TCGCGAGTCA CCTTTAGGTA 5220 

GGCGTGGAGC TTGGCCATGT ACGCGCTGAC CTCCTTGTGG GACGAGAACA GCCGCGTCCA 5280 

CCCCGGAAGG TTGGCCGGGT TGGTGATGTA ACTTTCCGGG ACGACAAAGC GGTCCACAAA 5340

CTGCATGTGC TCCTCGGTGA TGGGAAGGCC GTACTCCAGC ACCTTCATGA GGTTCCCGAA 5400 

CTCGTGCTCC ACACATCGCT TGTTGTTAAT GAAAATGGCC CAGCTGTGCG AGAGGCGCGT 5460 

GTACTCGCGT AGGGTGCGGT TGCAGATGAG GTACGTGAGC ACGTTTTCGC TCTGCCGGAC 5520 

GGAGCATCGC AGTTTTTGGT GTTCGAAGGT GGACTCCAGC GAGGCCGTCT GGGTCGGCGA 5580 

CCCCACGCAC ACCAGCACCG GCCGCAGGCG GCCCGCGTAC TGGGGGGTGT GGTACAGGGC 5640

GTTAATCATC CACCAGCAAT ACACCACGGT CGTGAGTAGG TGCCGCCCCA GGAGCCCGGC 5700 

CTCGTCGATG ACGATAATGT TGCTGCGGGT GAAAGCCGGC AGCGCCCCGT GTGTGACCGA 5760 

GGCCAGGCGC GTGAGGGCAC CCTGGCCCAG CCCCAAAGTC TGCTCTAGGG CGGTGAGGGC 5820 

GTGGAACTCG TTTCGCGCGT CTTCGCCCCC GTGCGCCGCC AGGGCCCGCT TGGTGATGTC 5880 

GAGGATCACC TCCCAGTAGT ACGTCAGGTC TCGCCGCTGC AGGTCTTCCA GCGAGGCGGG 5940

GCTGCTGGCC AGGGTGTACG GGTGCTGCCC CAGCTGGGCC TGGACGTGAT TCCCGCGAAA 6000 

CCCGAACTCG TGAAAGATGG TGTTGATGGG TCGACTCAGA AACGCCCCCG AGAGCTTAAC 6060 

GTACATGTTC TGCGCCGCGA TTCGCGTGGC GCCCGTGACC ACGCAGTCCA GGACCTCGTT 6120 

GAGGGTCTGC ACGCACGTAC TCTTTCCGGA TCCGGCGTTG CCGGTGATGA GATACGCCGC 6180 

GAACGGAAAC TCCCGGAGCG GCAGGCCGGT CGGGACCTCC AAGGCCGCCA CGTCCCGGAA 6240

CCACTGCAGG CGCGGCACCT GCGTGACGTC GAGCTGCTGC TGCGAGAGCT CTCGGATGCG 6300 

TGCGATGATT GGTTGGACCC CGTGCATGGA CGTAAAATTT AAAAACGCCT CGTCCCTGAA 6360 

CCGCACGGCG GGTCTGGCCC CGGGCTGCTG TGGGGGCGGA CCTGGTGCCC GGACGTCCCG 6420 
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CGAGCCCTCC CCGCCGGACG CCGCCATGGC 
GACGCGGGGC GGCGACGCGG CGCTATGCGC 
CACCCCCGGG ACGATGTTGT TCCGCGAGAT 
TCAGGGGGTG TACAACGTCG TCCGGTCCAG 
GATCTTCCAC GCGCTCCTCA ACGCCACAAC
CCACGTGGTG GCCCGCGGCC TCCAGCCGCA 
GGAGGGCGAT ATCGCCGGGG TGGCCGAGCG 
GACGACGCTG CTGGACTTTG CCCACGGGGT 
CGGACCGACC AGCTTCCCCA AATATATCGA 

ATTGCGCAAG ACGCGCGAGG GGGAGGCGAC
CACGCTGCCC CGGCAGCTGG CCACGGTCGC 
TCTGGAGCTG GCCGTCGCGT TCGACTCCAC 
CTACTACAAC CATCGCCGGG GGGAGTGGCT 
CGAGTGCCTG GTGCTGTGCC CCCCCCTGTG 

CGTTCAGCGG CTGTGCCCCG AGATCGTCGC
CTGCCGTCTG CGCAACACCG CGTCCGTCAA 
GCGCGGGGTG GCTGGCGCCG CGCGGGTCGT 
GAAGGCCGGC TCGGCCGCCT CGCGTCTCGT 
CCACGTGGGC GACATCAACG ACACGGTACG 

GATCGACACC CCCGCCGTCG ACCACACCCT
CGGGTCGGCG GCCCAGGACC CGGGGGCGCG 
GGCCGTGGTC AACAACATCA ACGGCATGCT 
CATAGAACGC CTGCGAGAGA CGAACGCGGG 
CGAGCTGCGG CGCGCCCAGG CGGGGGCGCT 

GGCCGGGGGA GGCGCGGGCC GCCCGGCGGA
TATCGACGTC AGCAAGTCCA TGGACGACGA 
GTACATCCCC GCGTACGGCC AGGACCTCGA 
GGTGCGCTGC TTCAAGATTC TGCGCCACCG 
GTACTCTAGC GGGGCGATCG CCTCCTTCGT 

CCCCCGAGCG GGCGCGCTCA TCACCGGCTC
GGAGGCGGTC TTTAAGAAAA CCCGCCTGCA 
CGTGGCGGAC GTACAGCACG CGGCTCTGCC 
CCGGGCGAGC GCGTCCCCGC GGGGCGGGTC 
GCCCGGGAGA ACGCCGAGGG GTGCGCCGGA 

CCGACCCCAC GCCCGCCGAT GAGGGAACGG
GGGACCGCAG TCTGGTCGAG GTGGCGGAGG 
CCTGCGAGGT GCGCCAGGTC AGCGATCGCC 
GCGTTGACGT CACCCCCAGG GGGCGGTTGC 
CGTACGTGGC GTCGGAGGAT TACTTTAAGC 

TTGCGGTCGT CGTCCTCACG GCCAACGAGG
TCGTTCTGCT GCACCGGCTC TCCTTGTTTC 
TCTGCCTGCT GATGTACCTG GAGAACTGTC 
TCAAGGTGTC GGCGTGGTTG GGGGTCGTGG 
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8880 


CCCGGAGCCA 


CGCCACGCCC 
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GCTGCCTTCT CCTCCGCAGC TGCCACTGGA 
TGAAGCCCTT CGACGACGAG CTAGTCCTGC 
CCAACAATCC GCCCCCCGTC CTCTCGGCCC 
TGCAGTTGCC CGGGCCCGTC CCCCGCACGG 
TGGGAAGCTG CTGGAAATCC AAGGACCTGC
GGAGCCCCAA ACGACGGACC TCGTCGCTTT 
ACGTGTTTTT TATGGAACGT TCCCTACCTG 
CCTGTGTGTG TGTCTTGTGC ACCGAAGGAG 
GAAAGACATG ATAGAGGGAA CAAAGAAATA 

AATACGGACG CGCGCACACG CGGGGGTAAG
AATTCAGGGA AACAGAAACC GCATCTTTTC 
GCCCCACACG CCTTCCAGAA CCCCCGTAAA 
CAGGTGATGG GCGCAGTCCA CGGGGGGGAG 
GACGCCGACC GAGTCCCCGC CCCCGGGACA 

CGCGTCCAGC AGGGCGCCTC CGCGGAAGGC
CGGGGGGGTC AGAACGCTCC AGTACTCCGC 
AAAGCGGTCA CAGGCGTCCT CCATGATGCC 
GCCGGCGGCC GGCCGCCGGA GGATTCGTCT 
GGCGTACGCG GGCCCGCGGA GAGGAAATCC 

CCAGAACCAC GCCCCGGTCT GGCTCCAGGT
CAGGGAGGGG GCGAGGCGCG GGCGTATGCC 
CTCGAGGGCC CGGCGGGCGT CCTGGATCGC 
CACGTTGAAC AGCCCCCAGA ACGCAGCCCC 
GCTGGCCGTC TGCTCGATCT GCAGGCAGAC 

GAGCGCGGGG CAGGCGTCGC ACGCGTCCGG
GGGCTCCGAG GGGGCGGCCG CCACCAGCGC 
GGCTTCCGGC AGCCCGGCCT CCCCGAGGCC 
CCCGGAGAAA CAAAACCGCG CCGTCCAGAC 
TTGGATGGTG GTGGCCGTGG GGTGCCACCG 

GCGGCCGGCC GCCTCCGAGG CCACGGCCGG
GCCCACCGCG GGCCAGGCCC CCAGGCACGC 
GTCACGCGCC GACTCGGCGG CGGCGGCGGC 
TCCCAGCACG GCAAAGTATT GGACGGGCCC 
AGCGAAGACG GGGGCCAGGG CTCCGGGGGC 

GAGGGCGACC AGCGCCGGGG CGGAGAACCC
GCGCGGCAGC AGCACCCGCG CCGTGACCCG 
CTCGCACACC TCGACCAGGT CCGCGAAGGC 
GGTGGTGTAT TCGCGCGCGA AACGCGCGGT 
GGAGGACGCG CACTGGGGGC TGTCGCGAAT 

GCCGGGGTGC TCGGCGACGC GCGCGGCCAG
CACGTCCAGG AGGGCGGCGC GAGGAGCGGC 
CACGACCAGA CCCGTCTGCG GGTCCCAGCC 
CCCCGTCTGG CGCTCCAGGG CCGCCAGAAC 
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GTCCCCCAGG GGCTCCAGCG GGGAGGCGGC CGGGGCGGCG CGGGCGGCCG CGACGGCCCG 11640 

GCGGCCGAGA CGTCGGGGGA GCCGTAGAAG TCCTGCAGGT CGGACGAACC AACGGACACC 11700 

TCCGCAAAGC GCGCGCGCGC CTCCCCCGCG GCGTCGCGAC AGACCAGATA CAGCAGGGCG 11760 

TGGAGGCAGT CGCGCGTGCG CGGGGGCAGC CATACCGCGT ATAGGGTAAT GGCGCTGACG 11820 

CTCTCCTCCA CCCAAACGAT GCCGGGGGCT TCCATGCCAC GACGCCCGGG GGTTGCCGTG 11880

TATCGAACGA GCGCGGCCCC AGACTTATAG GGTGCTAAAG TTCACCGCCC CCTGCATCAT 11940 

GGGCCAGGCC TCGGTGGGAA GCTCCGACAG AGCCGCCTCG AGAATGATGT CAGTGTTGGG 12000 

CTGGGCGCCG GAGGCGTGCG TGCGCAAGCA GCGCCCCCAC GCGGGCGCGC GCAGCTTGAA 12060 

GCGCGCGCCC GCAAACTCCC GCTTATGGGC CATCAGCAGC GCGTACAGCT GTCTGTGCGT 12120 

CCGGCAGGCG CTGTGGTCGA TGCGGTGGGC GTCCAGCAGC TCCACGATGG CTCGCTTGGT 12180

GAGGTTTTTA ACGCGCCCCG CCCCGGGAAA CGTCTGCGTG CTCTTGGCCA GCTGCACCCC 12240 

GAACAGTTCG CCCCAGATGA TCTTGAACAG CGACAGCGCG TGCTCCGTCT CGCTCACGGA 12300 

CCCGCGCGGG GGGCAGCCGC TCAGGGCGTC GGCCACGCGC TTAACCGCGT CCTCCGACAG 12360 

CAAGGGGCCG TCGGTCACGT TACAGTGGCC CAGTTCGAAC ACCAGCTGCA TGTAGCGGTC 12420 

GTAGTGGGGG TTCAGCAGCT CCAGCACGTC CTCGGGGCTA AAGGTTCGCC CCGACCCCCC 12480

GGCCATCGAG TCCCACTGCA GGCACGCGGC CATGGTGCTG CACAGACGGA ACAGCTCCCA 12540 

GACGGGGGCG ACGTTTAGGG TGGGGTGTAG GGCCACAAGC TCCAGCTCTC CGGCGGCGTT 12600 

GATCGTGGGG ATGACGCCCG TGGCGTAGTG GTCGTAAAGC CGCCGGAAGA TGGCGCTGCT 12660 

ATGGGCGGCC ATGGGGACGC GAAGACAGGC CTCCAGCAGC ACCAGGTAGA TGAACCGCGT 12720 

GCGGCCGACC AGGCTGTTGA GGCCGCGCAT GAGCGCGACC ACCTCGGCCG GCGCGACGTC 12780

CGGCCGGAGG TACTTTTCGA CGAAAAGGCC CACCTCCTCC GTCTCGGCGG CCTGGGCCGA 12840 

CAGGGACGTG TCGGGGTCCT GGCAGCGCAG CTCCCGCAGA TCCCGCTGGG CCCTCAGGGC 12900 

ATCAAAATGT ATCCCCCGCA AAAACAGACA AAAGTTCCTC GGGGTCAGCG CGGCGTCGTG 12960 

GCCCCAGAAC CGCACGTGCA TGCAGTTGAG GGTCAGAAGC ATGTGGAGGA TGTTAAGACT 13020 

GTCCGCGAGG CACGCCAGCG TGCACCTCTC GAAGTAGTGC TTGTACCGGA ATTTGCTGTA 13080

GATGCGCGAC CCCCGCGCCT GCGCCGCGTC GGCGTGCGAC GCGTCGCAGC GCCCTTTGAA 13140 

CCGGCGGCAC AACAGGTTCG TCACCTGGGA AAACTGTGCC GGCCACTGCC CGCTGGCGCT 13200 

CACCACGTGG TTGAGCAGCA TGGGCGTAAA GACGGGCTCC GAGCGCGCCC CGGACCCGTC 13260 

CATGTAGATC AGCAGCTCCC CCTTGCGGAG AGTCCGTACC CGCCCCAGCG ACTGGTACAC 13320 

GGACACCATG TCCGGCCCGT AGTTCATGGG TTTCACGTAG GCGAACATGC TGTCAAAGTG 13380

CGGCGGATCG AAGCTAAGGC CCACCGTCAC GACCGTTGTG TAGATGACCA CCCGGTACCG 13440 

GCCCCATGTG GTCACGTCGC CGGGCGGGGT GAGCGAGTGG AGCAGCAGCA CGCGGTCCGT 13500 

AAACTGCCGG CAGAACCTGG CAACGACCTC CGCGAAGGAG ACCGTCGACG AGAAGATGCA 13560 

GACGTTATCT CCGCCGGCCA GGCGCGCCTC CAGCTCCCCG AAGAAGGTGG CGTCCGGGGG 13620 

GGCGTCCGGG GGGGGCGCCC CGCCCGCCGG CCCCGGCGGG CGCAGGGCCG CCTGCAGGAC 13680

CTCGGGCCCC AGGCGCGGGA GAAACAGACA ACGGCGCGCC GAAAATCCGG GCATGGCATA 13740 

CTCCCCGATG ACCACGTGAA CGTTCTTTTC GCCCCGGAGG CTGCACAGAA AGTCCACCAG 13800 

CTGCGCGTTG GCGGTGGCGT CCATGGCGAT GATCCGCGGG CACGTGCGCA GCAGGCGCAG 13860 

CATCAACGCG TCGACGCGGC CCAGCTGCTG CATCGTCGGC GAGTACAGTT GGCCCAACGT 13920 

CGACATGACT TCGTCCAGGA CGAGCACGTC GTAGTTGTTC AACAGGTTCG GGCCCACGCG 13980

ATGAAGACTT TCCACCTGCA CGATGAGACG GTGGAAGGGG CGGTCGTTCA TGATGTAATT 14040 

GGTGGATGAG AAGTAGGTGA CGAAGTCGGG CAACCCTGAC TCAGCGAACC GCGTCGCCAG 14100 

GGTCTGAGTA AAACTCCGAC GACAGGAGAC GACCAGCACA CTCGTGTCCG GAGAGTGGAT 14160 







CGCTTCCCCC AACCAGCGGA 
CACAGTTACG CACCGGGCCG 
CTGCCGCTCG ATCGTTGTTT 
GTAAAGCATC CGCGCCAGCG 
GGGACGCCGG GCCCCCAGGG
GGCGCGGGCG GCGTGGTGGG 
CTGCGTCGTG GGGCTCCTGG 
CGCCCCCGCG ACCTCTTATG 
CGTCCCCCTC CGGTTGGACA 

GTTGCTGGCG GCGGCCGTGT
GCTGGATGCG GCCCGTCGCC 
CGCCGGAAAC GTCTGCGCGT 
CAGCCAGCTG GCCCACCTTA 
CCATTTTTGC ACCAGGGGGG 

TGACCCGGCG CCGACGCACC
CTTATTACTG GGCACCCTCC 
CGCCCTCAAC TTCAACTTTT 
CCTGCTTGTC GTGTCGCTGT 
GTTGGTGGGC CCCCACCTCG 

GCACTACCAC ACCGGTGGCT
AGTCCGCGTC GCCCTGGCGC 
CACGCGCGCC TACCTGTATC 
CACCCGGCAC CGCGCCCATT 
GCGTGGCGGG CCGCCCGGAG 

CCACGCCGAG ATCGACCGGT
CCCCGACCAC GAGGCCGAGC 
CGAGCCCATT TACGACACCG 
CAGCACCGTT CGGCGATGGT 
CACCATACTT CGGCGCGCGC 

CTTCCTTTTC TTTCGGCCAC
CACATACGAC CAAATACGGA 
GTGAGCGTGG CAGGAGGGCG 
GGGTCCGATG CGCGCCGGTA 
GGCACGTAGA AGTTACCCTC 

GTCAGCGAGA CGACCTCCCC
CGCGCCCCGG AGAACGCGAG 
TGTTTCGCCG GATGTCCCGG 
CGATCGGAAC GGCCTGGTCC 
GCCCGGCGGG CGCTCCGCGG 

CGCGGTGCCT GCCGAGGAAC
TGTCGAGGAC GTATCCCTGC 
AGATGGGCTC GCGGCGAACC 
GCTCGAGGTC GGGGACGCCA 



TCAGCGCGGT AGTTTTTCCC 
TCGGGGCGCT CGCGTCCGGG 
TCGGGTGGAC CCGGGGAACC 
ATACACTCGA CGTGTACTGC 
GATCCCCCGA GGCCGCGCCG 
TCTGGTGTGT GCAGGTGGCG 
TGCTGGCCTC TGTGTTCCGG 
CGGAGGCGAA CGCCACGGTC 
CGCAGAGCCT GCTGGCCACG 
ACGCCGCGGT GGGCGCGGTG 
TGGCGGCGGC CCGTATGGCG 
GGCTGTTGCA GATCACAGTC 
TCTACGTCCT GCACTTTGCG 
TCCTGAGCGG GACGTACCTG 
ATCGTATCGT CGGTCCGGTG 
TGTGCACGGC CGCCGCCGCG 
CCGCCCCGAG CATGCTCATC 
TGTTGGTGGT CGAGGGGGTG 
GGGCCATCGC CGCCACCGGC 
ACTACGTGGT GGAGCAGCAG 
TCGTCGCCGC CTTTGCCCTC 
ACCGGCGACA CCACACTAAA 
CGGCGCTTCG ACGCGTACGC 
ACCCGGGCTA CGCGGAAACC 
ATGGGGATTC CGACGGGGAC 
TCTACGCCCG AGTGCAACGC 
TGGAGGGGTA TGCGCCAAGG 
AGCCGTTTCG TTCGTTTTAA 
GTGTGTGTGT TTTTTTTGTG 
CACCCCCCTC CTCCCCCGTA 
CAATCATTTC TGTCTTTATT 
GGCCACGTCG GGGTCCCGCC 
CTGGGGCCCC GGCGCCCGGG 
TTCTTCGGAC TCGATGTCCA 
GCCGTCGGTG ATGATGACGT 
GCCCATAACT TGGCGAGCGT 
TAGATCCCCG GCTCGACGCG 
GGGAGGATCG ATGCCTTGGC 
CCGTCCTCCA GGCGGAACGT 
GTCACCAGGT GCGGTTGCAG 
ACCAAGATCT GTTTGAAGTT 
AGCTCCCCGG AGCTCCAGGC 
AACAGAAGCA CCTCCGAGAC 




GAGCCCATTG 


GCGCGCGGAC 


14220 


AAGGTGACGG 


GTCCGTGTTG 


14280 


CACTCGGCCA 


AATCCCCCCC 


14340 


TCGCACTCGT 


CATCCCCGAT 


14400 


GGCGCCGACG 


TCGCGCCCGG 


14460 


ACGTTCATCG 


TCTCGGCCAT 


14520 


GACAGGTTTC 


CCTGCCTTTA 


14580 


GAGGTGCGCG 


GGGGTGTAGC 


14640 


TACGCAATTA 


CGTCTACGCT 


14700 


ACCTCGCGCT 


ACGAGCGCGC 


14760 


ATGCCACACG 


CCACGCTAAT


14820 


CTGCTGCTGG 



CCCACCGCAT 


14880 


TGCCTCGTGT 


ATCTCGCGGC 


14940 


CGTCAGGTTC 


ACGGCCTGAT 


15000 


CGGGCAGTAA 


TGACAAACGC 



15060 


GTCTCGTTGA 


ACACGATCGC 


15120 


TGCCTGACGA 


CGCTGTTCGC 


15180 


CTGTGTCACT 


ACGTGCGCGT 


15240 


ATCGTCGGCC 


TGGCCTGCGA 


15300 


TGGCCGGGGG 


CCCAGACGGG 


15360 


GCCATGGCCG 


TGCTTCGGTG 



15420 


TTTTTCGTGC 


GCATGCGCGA 


15480 


AGCTCCATGC 


GCGGTTCTAG 


15540 


CCCTACGCGA 


GCGTGTCCCA 


15600 


CCGATCTACG 


ACGAAGTGGC 


15660 


CCCGGGCCTG 


TGCCCGACGC 


15720 


TCCGCGGGGG 


AGCCGGTGTA 


15780 


TAAACCGACG 


TTGTGCGTTT 


15840 


GTGTTTATTT 


TCCCCCACCC 


15900 


CTATACAACA 


AAAAATACCA 


15960 


CGCTGTCAGA 


GAGTGGGGGC 



16020 


GTCTGGTGTG 


ACGCGATGGG 


16080 


TGACCACGCG 


CATGTCGGGG 


16140 


CGACGTCAAA 


TTCGTGGGCG 


16200 


TGTGTCGGCA 


GCAGCAGGGC 


16260 


ATCGTCGAAG 


GCCAGGCGGC 


16320


GACGGGGGTG 


ATGATCAGGG 


16380 


GGGTCCGGGG 


GCCCCGCCAC 


16440 


CACGCCCTCC 


TCCGCGCCCG 


16500 


GGGGCAGTCG 


GGAAAGTGGC 


16560 


CGGGTGGCGG 


GGGTTGGCGA 


16620 


CACGGGAGAG 


ATGGTGCGAC 


16680 


AACGCCGCTA 


TTTAACTCCA 


16740 
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CCAGCGCCCG ATCCGGGGCG 
AGTCCCGGTC TTGGGTGACG 
AGCGCACGCC GGGGTTGGGG 
GCGCCATCAG GTCCTCGTAC 
CGTACTTGGC TCGGCACTTA
GGTAGCCGTG AGGGTCCCTG 
TGTGGCCGTC CATGAGGACC 
GGTGAAAGAC GAAGCGCCCG 
CGCAGTAGCG AAACAGCAGG 

CCGACGACTG GGCGTCCAGC
ACGGACCCTG CGCGCCCCAC 
CCCAGAGCTG GCAGTCGGCC 
GGGGGGCGAC GGCTTCGGCG 
GGCCGAGCCC GCGGTCCACC 

AGTCCAGGCG AGCCCACAGG
CGCGCAGGTG GCGCTCGAAC 
CCGACGCCGA CCACATCGGG 
CGTGTGCGCC CCCCGGCGAG 
AGAGGGCCGG GGACGCGGGC 

GTGGGGGGCT CTGGGGCCAA
CGGGGCGGGG CCCAAAGACG 
GACTATCGGG GTCGCGGGCG 
CCATTTTTAC GAGCAGCCGA 
CGCGCGGCCG GGTTGGCGTG 

GGCGGCAGCG ACACCGACGA
TCCGTGAACG CGCGCCGAAT 
TCATACGCCA GGCCGTGGGT 
ACGCAGCGAT AGGCGAGGAG 
TGGTAGCCCG GGACGCGGGT 

AGCAGCTCCA GCAGCGTCTG
TTCAGGGGGC GGTTGTTAAA 
GGCGGCTGGT TGTACCCGTG 
GGCATCCCAA ACCCCCGGGG 
GATATCGTGG AGTTGGAGTT 

AGCGACACCG CGTCCGATCG
CTGATCCCGC ACCTGGTGTT 
TGGAGGGCCG TCGCGACGGA 
TTGCCGAGGT CCATGTCGTA 
AGCGGGGTGA TAAAGCCGCG 

ACGAGCAGGG TCGCGACGAG
GCGAGCTTGT GTTCGCGAAT 
CGGGCCCCGG GGATCTCCAG 
ATGCATAGCT TGTGGATGCG 



GAGCATCGCC TTTTTTCGCC 
AGCGCCTCCT CCGGGCCCGG 
ATGGACCGGA TGAACGCCCG 
GCGGAGGCCG CGGGGGCGCC 
ACCTCGTAGA AGGCCAGGGG 
GGGCACACGA GGATGTCCAG 
CCGCACGCGT GCACGTTCTC 
GCGTCGGCGT CGTCGTTGAC 
TTTCGGGCCG TCGGCTCGTT 
CGCAGGCTGG CGTTGTGGGT 
CGCAGCGTGG AGGCGGTCGT 
TGGTTTTGCG TCGCCGCCTC 
GCGGACGGGG GGGCGCGGCG 
ATGCCGGCCG CCTCCAGCGA 
GGCCCGATGG CCAGAGGGGA 
GTTTCCGCCA AGATATGGGG 
TCGGGGTCCG GGGGACCGGG 
AGGGGAATGT CGGGGGTTGG 
CGGGCCTTTT CGCCCGGGGC 
TGGGAACCCG GGGCCCCCGG 
GTCGCCAGAT CTAGGCTGTT 
GGGTCCGCGG GGCGCTTGGC 
AGAGCTCGAG GGCGGAAGGG 
ACAGAGGCGG GAGACCAGCA 
CAGGACGGCC TTGTGCGTGC 
CTTGGGATTG CGAAGGTGGC 
GTTGGTCTCG GCCGAGTTGA 
GGCCACGGCA AAGTCCGGCG 
CACGGGGACG CCCAGGCTCG 
CCCCAGGGCG TAGAGATCGA 
CTCGGCCCGC TCGTTGTTGA 
CCCCACCAGA GTGTGAAAGT 
GGACTCGAGG TCCGGCTCCT 
CAGGGTCACC AGGCTAAAGT 
CAGCATCACG AGGACGTTGG 
CAGGAACACC ACGGCGCGCG 
GGGGGTGGTC GCGCGCAGGG 
CGCGGGGAAC ACGATCTGGC 
GATGTCGTGG GTGCGGCCGC 
CTCCACGGCA AACCACTCCT 
CAACTGCACC TCGCCGTACC 
GGTCGTGTAG CGGAGGGCGG 
CGCGAGGGAC AGGATGTGCG 



GGCGGCGCGG 


GAATCGAGCC 


16800 


GACGCGCCCG 


GGCGCGAAGT 


16860 


GAACGCCTCC 


GGCGATCGCC 


16920 


GGGGTCCGCG 


GGGTCGAACG 


16980 


GGTCTGGGGG 


GCGGGGGCCA 


17040 


GGACGCCCCC 


ACCATGCCCG 


17100 


CTCGGCGAGG 


TCCCCGGGTT 


17160 


GCCCGCGTCC 


GCGCGGCCCA 


17220 


CACCCGCCCG 


AACATCACCG 


17280 


GAGCCACTGG 


GACGAGAAGC 


17340 


CAGGCCCCGC 


CGAAGCAGGG 


17400 


GTAAAATCCC 


ATAAGCGGGC 


17460 


CGTCAGGCGC 


CAGAGGTGCC 


17520 


CACGACGAGG 


GAGCACAGAT 


17580 


GCGGACGCCG 


CGCAGPAGGC 


17640 


GGGCAGTGCG 


TTGGGGATCG 


17700 


GCTGCAGTCC 


GGGTCGATGG 


17760 


CGGGCCGGAT 


GAGGCCTCAG


17820 


CCCGCCGTCG 


GGTTGCCCAC 


17880 


TGACGTGGGG 


CGGGGTGGGG 


17940 


GGGTCGGGGC 


CGCTTCGGGG 


18000 


GCCGGGTGTT 


GCGGCGGCCG 


18060 


ATCCTCACGA 


CAGAGAGTGG 


18120 


CCAGCAGCGG 


CCTCAGCTCG 


18180 


GCTGGTAATT 


TATACACTGC 


18240 


GCCGGATGCC 


CTCCGGCACG 


18300 


CAAAGAGGGC 


GGGGTGCAGA 


18360 


AGAGCTGGTT 


GTTAAAGTAC 


18420 


GGGCCACGTA 


CACGCTAACC 


18480 


CCGCCAGCCC 


GACGTCGTGC 


18540 


GGTACTTTAC 


CAAGAGCTCC 


18600 


TGGCCGTGGT 


CAGGGCGGCG 


18660 


GGAGGCAAAA 


CTGGCCCCGG 


18720 


CGGCCAGGAC 


GGCCCGCCGG 


18780 


CGCACTTGAT 


GTCCAGGTGG 


18840 






18900


ACGCCAGCTG 


GCCGATGTAC 


18960 


GCTGCTGCAG 


CGAGAACCCG 


19020 


CGCGAAGAGC 


GCACTCCCCC 


19080 


TTTCCCGGAT 


GGTCTTCACG 


19140 


CCCCCGAGCC 


CCCGAAGCTG 


19200 


GGTTGACGGC 


GAATACGGGG 


19260 


AGGGGGGCGA 


CGGGGGCGAG 


19320 
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GTCATGGCCG TCTCGGACCT GCGCAGGGGC 
GCCTCGGGGG ACGAGCGGCG ACGAGACGAG 
CGAAGCCGCT CCCGGAAGCT GGATCGGCGG 
CCGTCTCGGG GGGAGGGGCC GCTTGGGCGT 
TAGGACGCGA GCCAGGCCTT GAAGGAGCGT
GCCACATGAC TAGCAGGTCG CTGTCGCCCG 
CCCCCCACAG AGACGCGTTC GCCGCGGCCT 
GATCGTCCGC CGCGTCCAGG CGCTCGCTAA 
TTAGAAAATC ACGTCGCGCC GCTTGCTCTT 

GTCGCATCAT CTCTAAGCGC GCGCGGGACT
CCTTGGCGGC CATAAAGGCG CCAACAAACC 
GTAGCTGCAG GGTCTGGTCC CTGTACACCT 
GGCGCAGGGC CGCGTGGCTG GCGTCTCGGC 
TCGGACTCCT TCGCCCCGAC CCCCCTGACC 

GGCCAGCAGC TGGCGTCCGA CGTGCAGCAG
CAGAAGGTGG GCGTCGACGA GGCGTCGGCG 
GTCCCTTTTT TGGATTTTGC CACCGCGACG 
GTCGGGACGC TCCACGACTG CTGCGAGCAC 
TTGCTGTTTA ATAGCCTGGT GCCGGCGCAA 

ACGGCCAAGC TGGAGTTCCT GGCCCCCGAG
CGGGAGTGCG CGCCGGAGGA CGCCGTGCCC 
ACGTTTCAGG CCCTGCACCG CTCCGAAGCC 
TTCGCCCAGT TGTTGAAAAC CTCGTTCCGG 
CCGAAGAAAC GGGCCAAGGT GGACGTGGCC 

CTCTTCCAGA AAATGATACT AATGCACGCG
GACCACGCGG AGCAGGTCAA CACGTTCCTG 
GACACGGCCG TGCGGCACTT CCGCCAGCGC 
GGAAAGACCT GGTTTTTGGT GCCCCTCATC 
AAGATAGGCT ACACGGCCCA CATCCGCAAG 

GCCTGCCTGC GGGGCTGGTT TGGCTCGTCC
TCGTTCTCGT TCCCGGACGG CTCGCGCAGC 
AACGTAAGTA CGCCTTCCTC CCGCGGTGCC 
GACCGACAGA CAAACACAGC CAGACGCGAG 
CCATGGCGGG GGGAAGCCTT ACTGTTTATT 

CCCGCGCGAC CGCGGGGCAG CTCGTTGCAA
AGAGGCGCCA CCCGGCGCTG GTCGGGCGGA 
CGACCTCGTG CAGGTGGGCC GTGATGCGCG 
CGTCCACGGG GTGCCCGAAG AGGAGCTGAC 
TGCGCTGGGC CATATTGGAC CACATGCACG 

CGGGGGCGCG CCACAGCGCG TTGGCGGAAT
CTCCTCCCGG GGGGTCGGTA ATCCTGGATA 
CCGGGGGACA GAGCGACCCC AGGTCATCAT 
GGAGGTGCCA CCAGGCCCCC GGACCCAGGG 
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CGGCCCACGA 
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GGCGGCGGGT 
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ACAGGCTCGC 


GTCCCCCCGG 


ACGGCCAGGG 


21600 


GGGCGACGCA 


GGGACAGGCC 


TCCGCCACGG 


21660 


CGATGTGGGC 


CGTCGGGGCG 


CAGGCGCCGC 


21720 


GCAGCCATCC 


TAAATGGCGG 


GCCCGGCTGC 


21780 


CCATGGCCCA 


GCAGTATATG 


CGGCCGCCGG 


21840 


CACAGCACGC 


CCCCGGATTC 


GGGGGCGGTT 


21900 
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CCGTGGGTAC CAGGTAGGCG CCGTCGAGCT CGTGGGCCAC GGGCTCGTCC GCGAGCTGTT 21960 

CGGCGGCGGG GTCGGGGGTT TCCTCCGGGG GGGAGGCAGC TTCCAGGTGG CCGAAGGCTA 22020 

GGGTGCACAG CAGCGGGGTC CGGGGGTGCG TTACGCTGCG GAGGTGGACG GTGGCGCAGT 22080 

AGCGGCGCTC GCGGTTAAAG AAGAAAATGG CAAAGAACGT GTTCGAAGGC AGGCGCAGCG 22140 

CCTTGGGCCG CGTCAGGTAC AGGAAGATCT CGCAGAAAAG GGCACGCTCG GGGTCGGGGT 22200

CCGGAAGGGC CACCTGGCAC AGCGGCTCGG TGAGGACCGT GAGGCACCGA AAAATCTTAA 22260 

GCCGCTCGTC CCCCCGAACG ACGCGCCACA CGAAGACAGA GTTGGCGATG CGCGCGACGA 22320 

GGTCGGCTTC GGGCCCCGGG TCGGGGGCGC GCGCGTCGGG GGGGGCGCCC CGGTGACCCG 22380 

GCGGGGCCGC GGCTCCCGGG GGGCCTGGCG TCGCCTGGGG ACGCCAGAGT GCCCGCTGTG 22440 

CCAGGTTGGT GGTGGGGAAG GGACCGGAGA CGCACCAAAA GCAGAGGGGC CAGCGCGTGT 22500

ATGAGTTGGG GGGGGGGTGG GTGAGCGGTG GAACAAAAGC ACGCGTCAGC GGACAAGGCC 22560 

GGGTCCCGTA GCCGCCCCGC GACAGAACCG GAGTCCGACG GCACGCGCGA CGGGGTCTGC 22620 

GAGGCTGAGG TACGCCGCGG TGTTAATGGT AAACGCAAAG CCTCCCGGAA AGACCACTAG 22680 

CCCGCAGAGG CGGCGATTGA ACCCAAGGCA GAGGTACGCG TAGCTCTCTC CCGGAAGGTA 22740 

TTGCTCGCAG ACCCTGTGCG GGGCAGTGGA GGGGCTGCCC TCCATGAAGC GACATTTACT 22800

CTGCTCGCGT CCATTGACGT CACCGTCAAT CACCACTGCG ATTGGACGGT TGGTAAGGCG 22860 

CAGCGTGTCT CCGCTGGTGC TGTAGTAGTC AAACGCGTAG TGGGCGTCGG AGTCGGCGAA 22920 

GCGGGCGGGG ATGTCGTCGC TGAGAGGGAC GAGCCGCCGC CGCCGCCCCC GACCGCCCTG 22980 

GCCGCCCAGA TGCGCCAGCA CGGCCAGGGC GTACGCGGTG TGAAAGAACG CGTCGGGGGC 23040 

GGTCCCCTCG AGGGCGCGCA TCAGGTTCTC CAGGAGCACG GGGAAGCGCC GCGTCACCTC 23100

CCCTAGCCAC TCGCTCTGGT GGGGGCCAAA GTCGTAGCGC AGGCGCTGGA AGATGCGCGG 23160 

GCCGCCTTGG AGCGCGGCCC GGATAGAGTG GCCCAGGGCC CGCAGACACG CGATCTGGAT 23220 

GCGCGCGACG AAGGCCACCT CGGCCGCGAT GTCAAAGGGC TGCAGCACGG GGCGCGGGTG 23280 

GCGCAGGGGT CCCTCGAGCG CGGGAAAGCG ACGCAGCAGC GCCGTCTGGG CCGCGGGGGA 23340 

CAGCTGGTGG GGGCGCACGA CGCGCTCGGC GGCACAGGCC TCCGTCAGGG CCGTGGCCAG 23400

CTCGGAGGAC AGCCGCGGGG GGCGGGCGCG TCGCCCGCCC CACGCCACCG AATTCTCGTA 23460 

GGAGACGACG ACGAAGCGCT GCTTGGTCCC GTAGTGATGG CGCAGGACCA CGGAGATGGA 23520 

GCGACGGCTC CACAGCCAGT CGGGCCGGTC GCCGCCGGCC AGAGCTTCCC ACCCGCGGTC 23580 

CAGCCACTCG ACCAGCGATC GCGGCTTGGC GGTCCCCGGC ACGAGGGTGA GCACGTCGTT 23640 

GAGGACGTCC TCGCCCGCGG CCCGGGGGCC CCCCGGGGTG GCAAAGCGCC CCCCGCCGGG 23700

CGGCTCCAGG CCCGCCAGCA CCGCCTCCGC GTCCGACGCG CCCAGGGCTC CCCCGCTGAC 23760 

GGCCTGGTGG ACCAGGGCGC CCTGGCGGAG CCCCGAGGCG ACGCCGGAGG CCGCGTGCTT 23820 

GGGGCGCGCG CGGACCGGGT GGCGGCGGGT GACGTCCTGC ACGGCCCGCT GGACCAGCGC 23880 

GAGGATCTCC TCGTTCTCTT GCGTGATGGA CACGTCCTCC GCGGTGGCCG TGTCGCCTCC 23940 

CGGGGCCGTG AGCTGCTCCT CCGGGGAGAT GGGGGGGTCT GGGGTGCCGA CAACGGCCGG 24000

CCCGGCCCCG CCCGAGACCG AGGACGCCTG GGGAGTGGGG GTGCCGCTTT CCCCCATCCC 24060 

CAGGGACAGG TGGGCCGCCG CCTCCGTCGC GGCGGCGGGA GCCGCGGCCC CCAGCCGCGC 24120 

GACGTAGCGA CAAAAGTGGC GACAGAGGCG CATGAGGCGC GCGCCGTCGG CCGCGTATCG 24180 

CGTGTTTGGC GGGACGAGCT CGTCGTAACT GAACAGGAGC ACGCGGGCGC AGGTCGCCCA 24240 

CGGGCCCCAC GCCAGGCGCA GCGCCGCGAC CGTGTACGGG TCGTACACGC CTTGGGCGTC 24300

GCACGCGACC GGCAGGGAGA CGAACAGCCC GCCCGCGCTG GGGACGCGCG GCAGGAGGTC 24360 

CGGGTGCGCC GGGATGACGG GGGCTAGGAT CGCCCCCACC GCATCCGCCG GCACGTAGGC 24420 

GGCAAACGCC GAACGCCACG GGGTGCAGTC GCCGGTCGCG TGGGCCCGGG TCTGGGTTTC 24480 

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GACCCGGAAG TTCGCGGCCG 
GACCCCCGCC GCCGCCAGGC 
GCTCGGGTCC CGGGTGAGAT 
GCCGGCGGTG CGAGCCCTTC 
GTCCTCGGGG CCCGCGGGCG
GAATCCGGGG CCAGGACTTT 
ATGCGGTCCA GACGATTATG 
CGTCGACCAA CACCGGGAAG 
ACGAGCTGCT CAACGTGGTC 

ACACCAACGC CACGGCCTGT
ACGGCGCCGT TCGCCGGACG 
TCGGGGGGCA GGCCCGCGAG 
AGCGGTTTCT GCTGTACCGC 
TGTACGTGTA CGTGGACCCG 

CGGTCGTCGG GAGGTACCGC
GCGCGCTCAC GGGATCGGCC 
AGGTGCTGGC GCTGCACCCC 
GCAGCCAGGA CTCGGCCGTG 
TGGCCTCGGC GGGGGCCAAC 

CCGGCGGCGC GGTATTGTAC
AATACTTTAT CAAAAAGTTC 
TGACGGTGCG CCTGCAGACC 
TCGAAACCGT CTCTCCCAAC 
CGGACGACCT CATGGTCGCG 

CCCCGGCCTT TTTTCCGATC
CTCTTTCTTT CCCCCCTCTC 
TACAACAGTG TTGTCCGTTG 
AAAACGGTCG GCGAACACAA 
TCGACACACA CCCCCCTTCT 

TGCCGGCGCT TTATTTACGT
CGCCTGCCCC CGCCTCAGGG 
GCGGCGCCGT CGAACGTACA 
TGGCCCGCGT GCGCCAGCCA 
TCCTGCATCA GCATGGGGGC 

TGCATCTGCA GGAGCGCGTT
ATGAGCGTGA GGATGAGGGT 
GACCGCAGCT CGGTGTTTAC 
GCCCCGTTGT AATACAGCAC 
TTGAGGTCCC GGATGCCCCG 

GGGAGCGGGA CCGGAAACCG
TGGCCCTCGA AGACGGGCGG 
GAGGTGTTGC GGAGATTGAC 
ACGATTCGCG TGGGCAGCAC 



CCCCGCCGTC GGGGCGGCCG 
ACTCGCTGGA GATGATGACG 
CGTATTGGAC CTCGTTGGCA 
CCGGTGCCGG AAGGGGCGTG 
CACGTGCGCT TATACGCTGT 
AACCTGCTTT TCGTCGACGA 
GGCTTTCTCA ATCAGGCCAA 
GCCAGCACGA GCTTTTTGTA 
ACCTATATAT GCGACGACCA 
TCCTGCTATA TCCTGAACAA 
GCCGATCTGT TTCTGCCCGA 
ACCGGCGACG ACCGGCCCGT 
CCCTCCACCA CCACCAACAG 
GCGTTCACGG CCAACACGCG 
GACGATTTCA TTATCTTCGC 
CCCGCGGACA TCGCCCGCTG 
GGGGCGTTTC GCAGCGTTCG 
GCCATCGCCA CACACGTGCA 
GGCCCGGGGC CCGAGCTCCT 
CCCTTCTTTC TGCTCAACAA 
AACTCCGGGG GCGTCATGGC 
GACCCGGTCG AGTATCTGTC 
ACCGACGTCC GCATGTACTC 
GTCATCATGG CCATTTACCT 
ACGCGCACGT CTTGAGTCTT 
TCCGCAATAA ACGCCTTCCC 
GTTGGGTGGT TGGGGTGCGG 
CATCGGGAAA ACGGATTCCC 
CCTTAAATAA ACACAAACCA 
CTTGTTTTTT TGCGTTTCCT 
GTAGCGGATA ACCGGGGCCA 
CACCCGAACC GCCGGGGCCA 
GGCGACCAGC GCCTCGTAAA 
TTCGGGGTGG ATGAGCTGGG 
CACGTATCCG TCCTGGGCGC 
GGTTCCTTCG GTTATGGAGT 
GGAGGCGAGT TGCTGGACGT 
GTTGAGGTCG GGGAGCTCCC 
GGCGACCAGC CGCGCGACTA 
CAGCGTGAGG TCCAGCGACT 
GACGAGGCTG ACGGGATCCC 
GGTGCCGGCG TGCGTGAGCC 
CCGCGTGATT ACCGCGGGGA 




CGCACGAGGG 


CGGACAGCGG 


24540 


TGAATCAGCG 


AGGCGGGGCT 


24600 


AAGTGCGCGT 


TCATGGCCCG 


24660 


GGTGGGGGGT 


GCGTGTGCGC 


24720 
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TGTCCCCAGG 


24780 


GGCCAACTTT 


ATTCGCCCGG 


24840 


CTGCAAGATC 


n x \_ ± x vu x v- x 


24900 


CAACCTCCGC 


GGGGCCGCCG 


24960 


CATGCCGCGG 


GTGGTGACGC 



25020 


ACCCGTGTTT 


ATCACGATGG 


25080 


CTCCTTCATG 



CAGGAGATCA 


25140 


CCTAACAAAG 


TCGGCGGGGG 



25200 


CGGCCTGATG 


GCCCCCGAGC 


25260 


CGCCTCCGGC 


ACCGGCATCG 


25320 


CCTGGAGCAC 


1X11 ILL I LL 




CGTCGTGCAC 






V- v_j x VJUvwu X \ 


a a cncic a a p a 

w/AVjjvjOv, r\r\L. /\ 


25500


TACCGAGATG 


CACCGCATCC


25560 


CTTCTATCAC

V» X X L. X X »w**L* 






ACAGAAGACG 


LLLULL x X v_ V7 


25680


GTCCCAGGAG 


CTCGTCTCCG 



25740 


CGAGCAGCTC 


AACAACCTCA 


25800 


CGGAAAACGC 


AACGGTGCCG 



25860 


GGCGGCCCCG 


ACCGGGATCC 



25920 


TCTTGCCGTT 



TCTTTTGTTT 



25980 


GGAACTGTGT 


TTTCCCCCCC 



26040 


GGGTGGGCGG 


GGGAAGCAAG 



26100 


GAACGTGCGT 


CTTCCCAGAT


26160 


CACGCTCGTT 


GGTTGGTTAA




CCGCGGGTCC 


CTTCCCAACA 


26280 


TGTCGCCGGA 


TTGCACAACG 



26340 


GGGCCAGGAT 


GTCCCCGAGT 



26400 


GCGGCAGCCT 


GCGCTCGCCG 


26460 


CGGCTTCTCG 


CGTGACGCTC 


26520 


TCAGCGCGAG 


CAGCCGGGGG 


26580 


AvjAL.**, A l\3 1 X 


GAGGACCjAGC 


26640


CGGCCACGAG 


CGAGAGACGG 


26700 


CGGGCGTCCG 


GGGGTCGGGG 


26760 


TCTCGCGGGC 


CAGGGGCGTT 


26820 


CCAGGCGCAC 


GTCCGTCGCC 


26880 


CGTTGCAGAG 


GTCGACGGGG 


26940 


CCAGGTCCAC 


GGGGCAGGCG 


27000 


AGCGCCTGCG 


GTACGCCAGC 


27060 
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AACAACCCCA ACGTGTCGGG ACTAACTCCT 
GCGAGCGCCA GCTGGCGGCG GATGGTCGGC 
AGCGCCGCGG CATCGGGGCG CGAGATACCC 
TCCGTGATCA TGGCGCCGGG CCGCGAGACA 
AGCGCAACGC AACCGGGACG ATGATGAAAC
GGGGGCGGGG CAGGGCTCAG CAGCACGCAC 
GTGAGAATCA GTCCCCCGGA GCTCGGGTCT 
CGGCTCCCAG TCCAAGCCCC CCCGGGGGGG 
AACGTCGGAA AATCAAACCC AATGCCCCAA 

AGGGAAAGCT GGGGAAGAAG AAGCCAATTT
TCGTAGATGA GATACTGCGT AAAGTGGGTC 
CTGCGTAGCA GGGCGGGGTC GCTGGCGCAG 
CACGGGTCTT CCACGAGCTC GCGGCACCCC 
GCGGTGGCCG TGGATACCGC CGATCCCGTT 

TTGGCCGTGA TGTCGGCCGC GGTGAAGAAC
CCGTTGAGGT GATAGGCCCC GTTGTACAGC 
CACGGGTTGG CCGTGGCCGC GAAGGGCCGC 
AGGGCTATGA CGTCCCCCTC CTTGTCCCCC 
GGGTTGCAGG GCCGGCGAAA GTAGTTGATG 

CACACGGCAT CCTGCCCGTG GTCCATGCCG
ACCGGGAGGG GCTGGGCCGG CCCCAGCCGG 
ACGGCGGCCG CGTTGTCTAG CAGCGGGGGG 
AGGTTGCCCA TGTCCGTAAC GGGGTTGCGG 
CCCACACCCA GGTCCACGTT TCCGCGCGGC 

GTTTCGTGGC GGGCCACCTG GAGCTGGCCC
AACAGCACGT TCTCGGTCAC GAAGCGGTCC 
TGGAGGCCCG TCTTGAGCTG GTGATACAGG 
ATGAGCGCGT AGGTCAGCGC GTTCTCCCCC 
GGCTGGCGGA TGGAGGAGAA GTAGTTGGCC 

CGCGCCAGGT CGCGCAGGGC CGGGGGGAAG
GCAAACAGCG CGTGGACGGG CAGGACGTAG 
AGGTGCTGGG GGGCCATGAG CAGCACGCCG 
TTGGCCGTCG ACGCGGTGTT GGCGCCCGCG 
GTGCGCTCGG CCATGTTGTG CGCCAGCACC 

ACGACGCGCC CGTTGTGGAA CATGGCGTTG
AGCGGGTGGG CGGGGTCGGT CACGGGATCG 
GGGACCACCA TGTTCTGCAG CGTGGCGTAC 
CAGCGCCCCC GCGAGAAGGC CGGCACCAGC 
CAGTCGGCCG GCCGGTGCGG CCGGTCGTCG 

TGCAGCAGCC GGCCGTCGTT GCGGTTAAAG
TAGACGGGCT CGTGTCCCCC CGCGTCAATC 
TGTCGCATAA GGCCGTCGCA GTCCCACACG 
AGGTGATTCA GCTCGGCCTG AGCCTGCCCG 
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CCGGAGACGA 


ACGATTCGTG 


CGCCACGTCC 


27120


AGAAAGACCA 


CTCGACCCTC 


GCACCGCTGC 


27180


GAGGGGATCG 


CGATGTCTGC 


TTCGAAACAA 


27240 


CCGGAACGCG 


GGGGTGCGGG 


AGGGCCGGAA 


27300 


AGAGATGGGG 


GGCACCGACC 


GTGTGGGAGA 


27360 


GGGGAGGTCT 


GTCGTGCGCA GGAGCCCCAG 


27420 


GGGTTTTATT 


GGGACCTGCC 


CTCGGAATCG 


27480 


CGGGGACAGG 


GGGTGTGTGT 


GGGTAAAAGC 




ACAGGAAAAA 


AAAAAAAGAC 


GGGCGGGTGG 


27600


TACAGAGACA 


GGCCCTTTAG 


CGGGGAGGCG 




TCTCGCGCGT 


GGGCCTCCCC 


ATCGCGGGCG 


27720


GTGATCGGGT 


AGGCTTCCTG AAACAGGCCG 


27780


GGCGGGCGCT 


TAAACTGCAC 


GTCGCTGGCA 


27840


TCCACGATAA


GACGCTCCAG 


GCAGCGATGT 


27900


TTGAAGCAGG 


GGCTGAGGAC 


GGGCGAGGCC 


27960


AGGTCCCCGT 


ACGAGAACCG 


CTGCGACGCC 


28020


GCCGGGTCGC 


TCTGGCCGTG 


GTCGTACATG 


28080


GCGTACACGC 


CGCCGGCCGC 


GCGTCCCCGC 




TCCGTGGCCA 


CGGGGGTGGC 


GATGAACTCA 


28200


GCGCGCCGCG 



GCACCTGGGC 


GCAGCCAAAG 


28260


TTTCCCGCCA 


CGACCGCGTT 


GCGCAGGTAC 


28320


GCCCCGCGGC 


CGAGGTAAAA 


GTTTTGGGGG 


28380


ACGGTGGCCG 


TGGCCGCGAC 


GGCGGTGTAG 


28440


TGGGTGAGCG 


TGAAGCTGAC 


CCCCCCGCCC 


28500


AGAAAGTACG 


V_ L_ ILL VJ./-iL 


VjV_\jL. 1 \_L>Ljri/\ 




TGCCGCACGA 


CGGTGAACCC 


GAACCCGGGG 


28620


GCCACGGGGC 


TCATCTTGAA 


GTACCCCGCC 


28680


GCCGCGCTCT 


CGCGGGCGTG 


CTGCACCACG 


28740


CCCAGGGCCG 


GGGGGACCAG 


GGGGACGTGG 


28800


TTGGGCGCGT 


TGGCCACGTG 


GTCGGCGCCC 


28860


AAGTATTCGC 


CATTTTGGAT 


GGTGTGGTCC 


28920 


GCGTGCAGCG 


CCCCGTCGAA 


GATGCGCATG 


28980


TCGGGCGCCG 


CGGAGCACAG 


CAGCGCCGTC 


29040 


TGCAGCGTGA 


GCATGGCGGG 


CCCGTCGACG 


29100 


ACCGTGTTGG 


CCACCAGATT 


GGCGGGATGC 


29160 




CCTCGCCGGG 


GGCGATCTCC 


29220


ACGCGGTCGA 


AGCGGACCCC 


CGCGGTGCAG 


29280 


ACGTAATAGT 


AGATTTTGTG 


GTGGACGGTC 


29340 


GCGGCGTCGG 


CCGCGCGGGC 


CTGGGTGTTG 


29400 


TCGGCCGTCG 


CCACGTTGCA 


CGCCGCCGCG 


29460 


CGGCAGTCTC 


GGTGGCGGTC 


CAGGGCCGCG 


29520 


AGGGGCGGCA 


GCAGCGCCGG 


GTCGCGCATC 


29580 


CCCAGCTCCG 


GGCCCGGCAG 


GGTAAAGTCG 


29640 


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TCCACCAGCT GGGCCAGGGC CTCGACGTGG 
TCCTCGGGGA GGTCGCCCCC GAGGTAGGTC 
ACGAACGCCG CGCATCGCGT GTTGTTCCAG 
GCCAGGGCGC AGAACACGTG CTCGCTGCCG 
GCCGGGTAGC TGCGGTCCTC GAACGCCCCG
GCGTGGCGGC CAACGCCGAG CTCCAGGCCC 
GCCAGGGGCA GGTTGCCGTT GACCACGCGC 
GGGGGGACTT CGCCGCCGGG AAGCTCGACG 
GGGTGCAGCT CCAGAGCCAG GTTGGCGTTG 

TGGCACTCGG CGACCCACCG GACCCGGCCG
AAACGCTGCT GCATGTCCGC GCCGGGGCCG 
TTCGCGGCCT CGACGGGGTC GTGGTTCACG 
GAAGGATGAC ACACGGTCCC GACCGCGTTC 
TTTCCCCAAA AAAACAGCTG CCGGGGAGGG 

GGCACCAGGT CCCCGGCGTG CGCGGCGAAG
GGCAGGACGA ACGTCAGGTC CATGGCGCCC 
TAGATGCGCT TCTCCAGGGC CTCCAGGAAG 
GCGCGCACGC GCGTTGTCTG GGGGGCGCTT 
TCGAGTTGCT CCTCCTGCAT CTCCAGCAGG 

GCCTTGCCCA TCACCAGCGC CGTGACGAGG
GTCACCGGCA CGTCGGCCTC GGTGTCCTCC 
TTGATGGCGG CGGTGGTGAC CAGCACCCCG 
GTCAGGCGGG GCACGGCCAC GGACGGCTGC 
TCGATGGCCT CGCGGCGATG GCCCGCCTTG 

CGCTTCAGCT CGGCGACCAG GGTCGCCCGG
AGATATCGTT GCATGGGCAA CAGCAGGGCC 
AGCATCTGGT CGGCCGTGCC GCGCTCAAAC 
AGCTGCTGGA TGGCGCGCAG CTGGCGATGC 
CCCGTGAGCA GGGCAATGGC CTCGGTGGCC 

TCGATGACCT TCGTCATGTA ATTATGCACG
GCGATGAGGG GCTGGTGGAC CTCGAACTGC 
GGGAACTTGG TGCACACGCA CGCCACGGAC 
AGGGTGTTGC AGTAGGACCC CAGCAGGGCG 
TCGGAGCGCA CGCGGGCGAA AAAATCAAAG 

CTCAGGATGG AGCCGGTGGG CACCATGGCC
GCAGGAGCGG CCATTGGGTT CCTTGGGGGA 
AGGCCAAGCC CCTCCCACAC AACGCCTCAC 
CCGCCAAAAA CCCCAAGGGG CAACCCGACC 
GGGGGCGTTG GGAGGCAAAA AGAAAGAAAA 

CGTCCTCTGT CCCCGAGCAC CCACTGTGCC
TTATATACCC CCCCGCCACA CCCCCGTTAG 
GTCCAAAAGC GTGCTAGAAA AAAGTTGGTA 
CACATGGCGG CGCCGGCCGC GCAGGCGATT 
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AACCCACGGG 


31860 


AACAACAGGC 


GAGGGGAGGA 


AAGGCGTAAA 


31920 


CACCCAGACG 


TAGGCCCGAG 


GACCGGCCGG 


31980 


CAACAGGCAC 
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TGCCCCTGCC 
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AACGCGACGG 


GTGCCTTCAA GATGGCCCTG 
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CGGGGCAGCA 
TGGGCGGCCG 
AACACGGGTG 
CGTGCCAGCC 
GCCCGTCCGG 
AGGAACGACC 
AGACGCGAGT 
TGTTCATCGA 
CGATCCAACA 
AGCGAAACCG 
GGGGGGCGGA 
GGTAGTATCC
GCCGTTGGGC 
GATGTGGTTC 
CGGTTAGTCA 
CCAGAGGGCC 
GGAAGCCGTC 
CCGGCCACAA 
CCACCGCCAG 
AAGTCCGCCC 
GGACGTTGTG 
CGTTTATTCC 
GGTGGTCCGG 
CCGCCAGCGC 
CGTGTGTCTG 
TGAAGTGCTG 
GCTGGTGACC 
GGCGATCGTC 
GGCCCAGGCG 
TGACGGCATT 
GATCACGGGC 
CAAGCGGGCC 
GAGCGTCTCT 
CCAGGAGGCC 
CGACCGGGCG 
CGTGCATGGG 
GTTTCTCGGG 
CATCCACCGG 
CATCCAGCGC 
GAACGGGCTG 
CGACCTCCTC 
CCTGCTGCGC 
CGTCATGTAC 



GCTGGCTGGC 
CCACCAGGGC 
GCGACAACGG 
ACGTGACATA 
TCCACCCGTA 
AAGCAAACAT 
AGGCCGAACT 
GCGGCAGTTC 
AAGGAACATC 
AAAGTAGTGC 
GGCCAAAATC 
GCGGATGCGA 
CCGGAACCCC 
GCGAGCCGCA 
GTGGGAAGGC 
ACGTCGGGAA 
GTTTGCCCAA 
CCCTACGCGC 
CCCTTGCTCC 
CGTGGCTCGC 
TTTTACGTCA 
GTAGGGCGGC 
GGGACAGGCC 
GGCCTGCGGG 
CTAAACCCGA 
GACGAATGCC 
GGGGTGCGCG 
AACATTTCCT 
CACCTCCCCC 
CCCGCCCCGC 
ACCCGCGCCC 
ACCGTCAGCG 
TCCGCCCCCC 
GCCCCGCCGG 
CTGGAGGAGC 
TTCCGGGAGC 
GCCGCGCTGG 
GAGCGGCGCA 
CACGGCCTGT 
TTCCGCGACG 
CCGCCCAAGG 
TTTGTGGACT 
CTCGGCGCGT 



GGTGATCCAA 
GCACAGGGCC 
CAGGCGATCC 
GTAGGCGAGG 
ATACATGCCC 
CACCACCCGC 
GACAAAAAAA 
GCCGTCCTCC 
ATCCCGCATT 
TGGCGGCGCG 
AAACAAGCAC 
GTGCCTGGCG 
CGAAATTCAC 
CATCCGTGCG 
AGGGGGAAAG 
TGCGCCCGGA 
GCACCGACGC 
GGGCGGCACG 
CACCACCCTC 
CGGCCATGGA 
CGGCAGACAG 
CTCGGGATTC 
CCAAAGACCG 
ACGTGCGGCC 
ACGTGAGCTC 
TGGCCGAATA 
TGCGCGCGCG 
CGCGCTTCGC 
GGCTCCCGAG 
GCCAGCCCCT 
CCAGACCGAT 
AGTTCGTGCA 
CGCCGAGCGC 
GCCCCCCGCT 
CCCACGCCGA 
AGGCGTGGAA 
CCCTGAGCCC 
TGTCCCCCTT 
ACGTTCCCGC 
CGCTGGCGGC 
ACGTGCCGGT 
CGCAACGCCT 
TCCTGGGCGT 



TGGAAAAGCC 
GCGCCGCCCA 
CGTTTGATGT 
ACGGCGGCTA 
GCGGCCACCA 
TTGGAAAAGA 
TCAGACGTGC 
CCGCCACACG 
GTCATGGTCG 
GGCCCGGGTC 
CGCGCGGGTT 
AAGTCACGTC 
ACCCACGCCC 
TCCGCCCTCC 
ATGGGTTGGG 
GTTGTCCTTA 
CGCGATCCAC 
CGCGAGAGCA 
CTCCCACCAC 
GCTCAGCTAT 
AAACCGCGCC 
TCAGCCGGGG 
CATGGTCGCC 
CGTGGGGGAG 
CGAGCGAGAC 
CTGCACCTCG 
AGACAGGGTC 
GTACACCCCC 
CTCGCTGGAG 
GGACGCCCGC 
GGCCGGGACC 
AGTGAAGCAC 
CCCCGACGCG 
CAGGGAGCTG 
GTCGGGATTG 
GCTGTTTGGG 
GACCCAAAAG 
CCCCGCGCTC 
GCCCGACGAA 
CGGGACCGTG 
GGGGAGCGAC 
GACCCCGGGG 
GTTGTACGCC 




CGTCGGGACT 

TGATCACGCA 

TCACGTACAG 

TAATACATGC 

GCTCCAGCGG 

CCGGCTGGGT 

CGTACGAGGA 

CGGCCTCGTA 

GTGCGGGGAG 

CGGACCCAAG 

CTACACACAA 

CCAGCAGGAT 

TGACGCCCAA 

CCCGCGGGCT 

GGAGGAAACG 

AAAGGCCGGC 

AGTGGGGGGA 

ACCCACGGGT 

CCCACTATTC 

GCCACCACCC 

TACTTTGTGT 

GAAATTGCCA 

AACTACGTAC 

GACGAGGTGT 

GTGATTAATA 

CTGCGAACCA 

ATCGAGCTAT 

TCCCCCTACG 

CCCCTGGTGA 

GACCGGCGCA 

GGGGCCGGGG 

ATCGACCGTG 

AGTCTGCCGC 

TGGTGGGTGT 

ACGCGCGAGG 

TCGGTGGGGG 

CTCGCCGTCT 

GTGCGGCTCG 

CCGACGTTGG 

GCCGAGCAGC 

GCGCGGGCCG 

GGGTCGGTCT 

GGCCACGGAC 



GAACGTCTCA 

CAACCCCCAA 

GAGGAGCGCC 

CGGCGCCACC 

CTTGAGGACC 

GTGGGGCGGA 

CAGCGAAAAC 

TACCAGCTCG 

CCGGCGAGGC 

CTTCAGGGAT 

CCCCCACCCG 

ATAAACCTCG 

ATCATGGGTG 

GATGACGTGG 

AAGAAAACAC 

CGTGCGTGAC 

GTTCCTCCGT 

CCCGTTCGCG 

CCCCCCCCCC 

TGCACCACCG 

GCGGGGGGTC 

AGTTTGGCCT 

GAAGCGAGCT 

TCCTGGACAG 

CCAACGACGT 

GCCCGGGGGT 

TTGAGCACCC 

TATTCGCCCT 

GCGGCCTGTT 

CGGATGTCGT 

GCGCGGGGGC 

TTGTGTCCCC 

CCCCGGGGCT 

TCTACGCCGG 

AGGTCCGCGC 

CTCCGCGGGC 

ACTACTATCT 

TCGGTCGGTA 

CCGATGCCAT 

TCCTCATGTT 

ACAGCGCCGC 

CGCCCGAGCA 

GCCTGGCCGC 



32280 

32340 

32400 

32460 

32520 

32580 

32640 

32700 

32760 

32820 

32880 

32940 

33000 

33060 

33120

33180 

33240 

33300 

33360 

33420 

33480 

33540 

33600 

33660 

33720 

33780 

33840 

33900 

33960 

34020 

34080 

34140 

34200 

34260 

34320 

34380 

34440 

34500 

34560 

34620 

34680 

34740 

34800 
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GGCCACGCAT ACCGCGCGCC TGACGGGCGT 
CGACCGGATG TCCGCGTTTG ACCGCGGGCC 
CGGGTACCTG GACGCGCTGC TTACCGTTTG 
GTGAGATATC CCAATAAAGT GCAGTCGTTT 
5 ACGGGGGACT ATGGGGGGGG GGGGAAAGGA 
CAGAGGCGGT AGCGGACGCA CGGCGGACAC 
GTCGGTTGGG TTGGGCGTGG ACGCCGCTGC 
AAAAATGGGA CGCACGTTCG GACCACCCTG 
ACGACCCCCA GCGCGGACGC GGCCAGAAAC 

10 TCAAAGGCCA GCAGATGAAT CACAGTTCCG 
ACGTCGCTGG AAAACACGTT CGGGGTGCCC 
GTGGCATCCG TGTCCACCAG CAGCACCGAC 
AACACGGCCC CCACGAGGCC GAGGTCGCGC 
TCAATCTCCC GCGCGTGCCC TTCGCAGGTG 

15 CGGACGTCAA CGCCCGTAAG CTTGTATCCG 
ACGTAGCTGG CGTTGTGGGT GATGGGCACG 
CCGCTACACT GGTGGGTGGC CTCCGGGACG 
CAGCGCGTGA GAACGGAGGC CACGCCGCGG 
TCGGATCGGG TGGCCATGGC CAGCGCGTCC 

20 CGCAGGGAAG CTGCGCATGG GGAAAAGTGG 
TCGGTCCTGG CTAGCGCGGC CCGGAGATCG 
CACAGGGCCG TGGTTATGAG GAGGCCCCGG 
GCGCCCGCCA GGAACGGCGC CCGGAGGACG 
ATCGGGGCGG TTAGCGCGCG GCCGCCGAGA 

25 GCCGCGCTCG GGGCCACCGC GCCATAGGCC 
TAGCCCAGCG CGTGCGCCGC CAGGCTCTGC 
CCGAGGCGCG CCTCCAGCCG CAGGCGGGCC 
ACCGAGTCGG CCGCGCAGCC CGCTGCTCCC 
TGGGCCAAAA AGCCCAGCAG GTCGGAGAGG 

30 ACGAACGCAA ACCCCGACGA GGCGAGCAGC 
GCGTCCGTGC CGGAGCCCGG GTCCTCCCCC 
TGGGCGTAGT TCGTGCTCTC CTCGGGGTAG 
GAGCCGTTGT CGGCGGGCGT CGGGGCCCCC 
GGAGGCCCGG GGAGCACCGC GGGGGCGTTT 

35 GTCTTGTCCG CAGGCACCAC TATGATCTCG 
AGCCCCATGA AGCCCTTCCC GTATCGCGCG 
AGCCCGCCCG TCGTCCAGAC GCCCACGGGC 
TACCGACCCG GAGTCCGTAG CAGGCCCCTG 
AGATGCGCGA TGCTCAGGTT CGTCGTCGGA 

40 GGCGGCGCGT TGCGTCGGCC GTCCGGGTGC 
AACGTAAGCC CCTCGCGGTC CGGCGCGGCC 
AGCGCGGAGG CGCCGGGGTT GTGCGACAGT 
CGGGACGTGG GCCCCGCCTC GGGGAGCTCG 
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\J A\- \J 1 L L C rG 


GTCCTGACCG 


TGGGGGACGT 


34860 




GCTGGCCGCA 


CGCGAACCGC 


34920 


CCTGGCTCGC 


GCCCAGCACG 


GCCAGTCTGT 


34980 


TCTAACCCAC 


GGATGCCGTT 


GTATGCCTAT 


35040 


AAGGAAACAG 


GAATGGAGAA 


GGGAAAGGAA 


35100 


AATAACAAAC 


AGACCGCGGA 


CACGGAGGGA 


35160 


GTCCACACAC 


CCGTTTATTC 


GCGTCTCCAC 


35220 


AGGATGCCCG 


CCAGGGCCGC 


GGTGATCATA 


35280 


CCGGGGGCGA 


TGGTGGCGAT 


GGGCAGCGTG 


35340 


TTGGGGAACA ACAACAGGGC 


CACGGACGGC 


35400 


GCCACCGGCC 


CCTGGGCCAG 


CTGCTGTTGG 


35460 


ATGACCTCCC 


CGGCCGGGGT GTAGCGCAGA 


35520 


CGGTTTTCGG 


TGCGCACCAG 


l_L.L>U i 1 CVsviV-. 


35580 


GCGGTGAGAT 


AGGTGATAAA 


CAGCGGGCGG 


35640 


ATCCCGCGGG 


GCAAGGGGGT 


GTGGGTGACG 


35700 


AGGATCCGGG 


GCTCCGCGTT 


GTGCGACGGG 


35760 


AAGGCGCGGA 


TCAGGGCGTT 


GTAGTGCGCC 


35820 


GTCTGTTGTG 


CCATGACGTC 


CGCCGGGATG 


35880 


AGGATGAACC 


CGCCCTCGGC 


GAGATCGAAG 


35940 


TCCGGGAGCC 


AGAAGAGGTT 


TTTCTGGTGG 


36000 


GCGTGGGTCG 


CCGCGGCGAC 


GTCGGACGTA 


36060 


CGGGCGCGTT 


CCCGCTGCTC 


GGCCGAGGGC 


36120 


GCCGTGGCGT 


AAAACAGCGC 


TCGGCGGACC 


36180 


AACTCGGCGT ACAGGGCGTC 


GATCAGGCGG 


36240 


GCGGGGCTGT 


CCAACACGAA 


CGCCAGCTGA 


36300 


TCTCGCTCGA 


GGATCGCGGC 


CACCAGATGC 


36360 


GCCGGGTCCA 


ACACGGACAC 


GTTCAGGAAC 


36420 


CGGGCGGCCA 


GGCCGGCCAG 


CACGCGCGAG 


36480 


CGAATCGCGT 


CGTGGGCGTG 


GGCCGCGTTG 


36540 


CCCGCGAGGC 


GCCAGAACAG 


GGACGGACGC


36600 


AAAAACTCCG 


CATAGGCCCG 


CGACATATAC 


36660 


CCGGCCACCC 


GCCGGAGGGC 


GTCCAGCGCC 


36720 


AGGACAAAGA CGCGATACCT GGGGCCGGCC 


36780 


TCGTCGGTCG GATTTCCGAC 


CCGAGCGAGG 


36840 


GCCGGAGGGC 


TGTCCCGCAT 


CGATATCACG 


36900 


CGCACGAGCG 


CGGCGTCGCA 


CCCGAACGCC 


36960 


CACGTCGAGG 


CCGACGGGGA 


GAGGTACACG 


37020 


GCGGCCAGCC 


AGGTCACGGA 


TGCGTTGTGC 


37080 


TGCCTCGGTG 


TCCCCGCGGG 


CGGCCCCGGG 


37140 


CTCTCGGTCG 


CCCCGTCGTC 


TCCCCGCGGG 


37200 


GCGAATGTTA 


CCCAGGCCCG 


GGACCGCAAC 


37260 


CCCTTGAGCT GGGTCACCTC GGCGGGGGGA 


37320 


GGCAGGCTCG 


CGTTCCGAGG 


CCGGCCGAGC 


37380 
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AGATAGGTCT TTGGGATGTA AAGCAGCTGC CCGGGGTCCC GAGGAAACTC GGCCGTGGTG 37440 

ACCAACACGA AACAAAAGCG CTCGGCGTAC CACCGAAGCA TGGGCACGGA TGCCGTAGTC 37500 

AGGTTGAGTT CGCCCGGGGG CGCCAAGCGT CCGCGCTGGG GGTCGCTGGC GTCGGGGGTG 37560 

TTGGGCAACC ACAGACGCCC GGTGTTTGTG TCGCGCCAGT ACGTGCGGGC CAACCCCAGA 37620 

5 CCGTGCAAAA ACCACGGGTC GATTTGCTCC GTCCAGTACG TGTCATGGCC CCCGGCAACG 37680 

CCCACCAGGA CCCCCATCAC CACCCACAGA CCGGGGCCCA TGGTCGTCCG TCCCGGCTGC 37740 

CAGTCCGCAG ATGGGGGGGT GTCCGTACCC ACGGCCCAAA GAGGCTCCGC ACCTCGGAGG 37800 

CTATCGGAGG CCCTTTGTTG CCGTAAGCGC GGGCCAAAGG ATGGGGTGGG GTGAGGGTAA 37860 

AAGCACAAAG GGAGTACCAG ACCGAAAACA AGGACGGATC GGCCCGCTCC GTTTTTCGGT 37920 

10 GGGGTGCTGA TACGGTGCCA GCCCTGGCCC CGAACCCCCG CGCTTATGGA CACACCACAC 37980 

GACAACAATG CCTTTTATTC TGTTCTTTTA TTGCCGTCAT CGCCGGGAGG CCTTCCGTTC 38040 

GGGCTTCCGT GTTTGAACTA AACTCCCCCC ACCTCGCGGG CAAACGTGCG CGCCAGGTCG 38100 

CGTATCTCGG CGATGGACCC GGCGGTTGTG ACGCGGGTTG GGATCATCCC GGCGGTGAGG 38160 

CGCAACAGGG CGTCTCGACA CCCGACGGGC GACTGATCGT AATCCAGGAC AAATAGATGC 38220 

15 ATCGGAAGGA GGCGGTCGGC CAAGACGTCC AAGACCCAGG CAAAAATGTG GTACAAGTCC 38280 

CCGTTGGGGG CCAGCAGCTC GGGAACGCGG AACAGGGCAA ACAGCGTGTC CTCGATGCGG 38340 

GGCAGAGACC CCGCGCCGTC CTCGGGGTCG GGGCGCGGGG TCGCCGCGGC GACCCCCGTC 38400 

AGCCGGCCCC AGTCCTCCCG CCACCTCCCG CCGCGCTGCA GGTACCGCAC CGTGTTGGCG 38460 

AGTAGATCGT AGACACGGCG AATGGCGGAC AGCATGGCCA GGTCAAGCCG CTCGCCCGGG 38520 

20 CGTTGGCGTC TGGCCAGGCG GTCGGCGTGT TCGGCCTCCG GAAGGACACC CAGGACCAGG 38580 

TTCGTGCCGG GCGCGGTCGG GGGCATGAGG GCCACGAACG CCAACACGGC CTGGGGGGTC 38640 

ATGCTTCCCA TGAGGTACCG CGCGGCCGGG TAGCACAGCA GGGAGGCGAT AGGGTGCCGG 38700 

TCGAAAACAA GGGTGAGGGC CGGGGGCGGG GCTTGCGGGC CCACAGCCTC CCCCCCGATA 38760 

TGAGGAGCCA AAACGGCGTC CGTCGCCGCA TAAGGCGTGC TCATTGTTAT CTGGGCGCTG 38820 

25 GTCATTACCA CCGCCGCCTC CCCGGCCGAT ATCTCGCCGC GGTCCAGACG GTGCTGCGTG 38880 

TTGTAGATGT TCGTCAGGGT CTCGGAGGCC CCCAGCACCT GCCAGTAAGT CATCGGCTCG 38940 

GGGACGTAGA CGATATTGTC GCGCGGCCCC AGGGCCTCCA TCAGCTGCGC GGAGGTGGTG 39000 

GTCTTCCCCA CCCCGTGGGG TCCGTCTATA TAAACCCGCA GCAGCGTGGG CAGCTCCGGA 39060 

TCCCCGCGGG CTTCGGAGGC CCCCTGGCGA TGGCTAGGAC GGGACGCCGC GCGGCCGTCG 39120 

30 GTAGGCCCGC TCGCACGAGC AGCCTGACCG AACGCAGGCG CGTGCTGTTG GCCGGCGTGA 39180 

GAAGCCATAC CCGCTTCTAC AAGGCGTTCG CCCGAGAGGT GCGGGAGTTC AACGCCACCA 39240 

GGATTTGTGG AACGCTGCTG ACGCTGATGA GCGGGTCGCT GCAGGGTCGC TCGCTGTTCG 39300 

AGGCCACGCG CGTCACCTTA ATATGCGAAG TGGACCTCGG GCCGCGCCGC CCAGACTGCA 39360 

TCTGCGTGTT CGAATTCGCC AATGACAAAA CGTTGGGAGG TGTGTGCGTC ATCCTGGAGC 39420 

35 TAAAGACATG CAAATCGATT TCTTCCGGGG ACACGGCCAG CAAACGCGAA CAGGGGACCA 39480 

CGGGCATGAA GCAGCTGCGC CACTCCCTGA AGCTGCTGCA GTCGCTCGCG CCTCCGGGGG 39540 

ACAAGGTCGT CTACCTGTGT CCTATTTTGG TGTTTGTCGC GCAGCGTACG CTGCGCGTCA 39600 

GCCGCGTGAC CCGGCTCGTC CCGCAAAAGA TCTCCGGCAA CATCACCGCG GCCGTGCGGA 39660 

TGCTCCAAAG CCTGTCCACG TATGCCGTGC CGCCGGAACC GCAGACCCGG CGGTCGCGGC 39720 

40 GCCGGGTCGC CGCGACCGCC AGACCGCAAA GGCCCCCCTC CCCGACACGT GACCCGGAAG 39780 

GCACGGCGGG TCATCCGGCC CCACCAGAGA GCGACCCCCC CTCCCCAGGG GTCGTAGGCG 39840 

TCGCTGCGGA GGGTGGGGGT GTGCTTCAGA AAATCGCGGC GCTTTTTTGC GTGCCGGTGG 39900 

CCGCCAAGAG CAGACCCCGG ACCAAAACCG AGTGAGGTTC TGTGTGTTGT TTTTTTTCCT 39960 

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CGTTTTGTTT TCTCTTCTTT CCCCCCCCCC 
GCTTAAGCGG AACCCGCGGG CGCGCGGGGA 
AACAGCCCCT GGGTGTAGAC CGCTGTCGCC 
CCTCAAAGAA CGTGGTGTTG GGCGCCGGCC 
5 CCGCCGCCCT CGAACATGGA CCCGTACTAC 
AGGCGCTTCA TCGTCGCCGA CTCCAGGAGC 
TGGATGTTGC CCGTGTTCAA CATCCCCCGG 
CAGGCCCAGC GCACCGCGGC CGCGGCGGCC 
CTGCCCGTCG ACATCGAGCG CCGGATACGC 

10 GACGCCCTGG AGGCGCTGGA GACCGCGGCG 
GACGCCGAGG CGAGGGGGGA GGGCGCTGCG 
CCCGCCGCCG CGGAGATGGA GGTTCAGATC 
ACCAACCTCC CCGTGGATCT GCTACACATG 
TCGGGAGTCG TCTTTGGTAC CTGGTACCGC 

15 CCCCTGACCA CCCGCAGCGC CGACTTTCGA 
GCGCTGGTCC TGTCTCTGCA GTCGTGCGGC 
GCCTTCGAGT GCGCCGTGCT GTGTCTGTAT 
CCCGATCGCG ATCGCGCTCC CGTTGCGTTC 
CTGGCGCGTC TGGCCGCGGT AATCGGCGAC 

20 GACGACAAGC TGCCCAAAGC GCAGTTCGCG 
CTGGCCACCC ACGTCGTGAT CGCCACGTTG 
GGCGACGTTC CCCGAGACAC CAGCACCCGC 
GACGTCAACC GCGCCGCCGC CGCGTTTTTG
GACCAGACGC TGCTGCGGGC GACCGCCAAC 

25 CTCCTCGCGA ACGGCAACGT GTACGCGGAC 
CTGATCCCGG GAGCCGTCCC GGCGGAGGCC 
GGCGCCATAA AAAGCGGCGA CAACAACCTG 
CTGTATCAGG CAGACCCCAC GGTCGAGCTG 
TGCCTGGACG CCCAGGCGGG GCGGCCACTG 

30 TCGGGCGCCC GCCAGGCGGC GCTCGTGCGC 
CGCACAAACA CCACCCCTGT GGGGGAGATT 
TACGAACAGG GCCTGGGGCT GCTCGCCCAG 
AAGCGATTCG CCACGTTCAA CGTGGGCAGC 
GGGTTCATTC CCCAGTACCT GTCCGTGGCC 

35 TGTTTTTCTG CTGTTGTTGT TTCTGGTCCG 
CGCGGGCTTT AGTCCCGGCC CGGACGTCGG 
TGGGTAAGTT GGTTCGGGGG CATCGCTGTA 
GTTTTGTTTG TTTGTGCGGG TGCCCATGGC 
GCCTCTGCCC GACCGGGCGG TGCCCATCTA 

40 CGGGGACCCG GGCGAGCTGG CCCTGGACCC 
GAACCCCCTG CCGATCAACG TAGACCACCG 
CGTGGTCAAC GACCCTCGGG GGCCGTTTTT 
GCGCGTCCTC GAGACGGCCG CCAGCGCCGC 





P»PPPPP7V TV PP 


AJ.LCXVACCX 




prnp j\ f iMlUlip'rnp 


VjV_V^ooVA?ACA 


CCCACCCGAC 


/1 rt a q n 


CCCvjXC X\jXC 


o\-V~ i i_ 1 v_v_ L 1 


rpmmmmpp PPP 

X 1 X X xvcccc 


/I Pit >t Pi 


AAX lLi XCCC 


rrappppppfp 

VjIjAoL-IjCCVj X 


pprnpppp'pp/^ 

CCjXCGCCCGC 




CCX X XCwACCr 


pppiTtpp. tv pprp 


TTWP A A P A P 

X X\iviGAACAC 


40260


TTCATCACCC


CCGAGTTCCC


CCGGGACTTC 


40320 


GAGACGGCGG 


CGGAGCGGGC


GGCAGTGCTG 


40380 


CTGGAGAACG 


CCGCCC I CCA 


GGCCGCCGAG 


40440 


PPP TV rnpp A rv* 
C CCtA X v- UAUC 


TV PP A ppfnpp A 
AijCACAjTGCA 


TCACATCGCC 


40500 


C^CCUCCAjCCIj 


A APAPPPPPA 

AAvjAVjoCwiA 


TGCCGCGCGG 


40560 


GACGGGGCAG 


CGCCGTCGCC


CACCGCGGGC 


40620 


GTACGCAACG 


ACCCGCCGCT 


ACGATACGAT 


40680


PTPT TV PPPPJP 
blbl ACLjCVjLt 


ot LuL oVj(j CC 


CGCGGGTTCG 


40740


A pp tv mpp a pp 


A A PPP A PP A *"P 

AACviCACCA X 


CGCGGACTTC 


40800


VjAC lj\jljCljC A 


TPTPP A APAP 

Xaj X\.C AACAC 


pmmp TV TVp A PP 

C XrX CA X\jACC 


40860


CGviC 1 (j I ALvj 


1 vjULxC c agc g 


CCACTATTCC


40920


PTPPTPTTi pp 

CXvjC XvjIACC 


CAACCACCCA


CGAGTCCTCC 


40980


r^r^r^c* a pptipp 
GGGG AC C I GC 


rrtpppp/-»pi^>/-~irn 

XXjGC c cgc c t 


GCCGCGCTAC 


41040 


vjAvj AvjC vjo AC 


GCCCGCAGTA 


CCGCTACCGC 


41100


C^CCHjCCCUCVj 


PPPPPT>APP A 

GCCGC X ACGA 


GCACGGGGCC


41160 


GXGCGCCACG 


GGGTGCTACC 


GGCGGCCCCG 


41220


vj 1 oAACCCCCj 


A PP A P P TIP P P 

ACGACG X GGC 


CCACCGCGAC 


41280 


GCACGCGGCC 


TVP A A PPTPTVp 

ACAACC XC X I 


CCTGTGGGAG


41340


ACCATTACGG 


CCCTGGCCGT 


GCTTCGGCGG 


41400


ppppmp/--* TV P TV 

CCfCC XCGACA 


ACCGCC rGC A 


GCTGGGCATG 
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41460 


a nvppnvppp 
AXX.GC XT_GGG 


GGGCGTCCGG


ATTGGACTCG 


41520


P APPPPPrppm 
IjACAjCVjC X Vj X 


PPPfTUTpTV APT<A 
CCC X X AAC X A 


mpm a pfTirnPPP 

TGTAC7 TCCG 


41580


A ppp tv prnrriPTi 
ACCC AC? X X\j I 


iwnpppppppm 

x x^cvjCsCjIjC r 


/— i/~>ppp/^ppfTlp 

UCjCCCCCC X c 


41640


PWWP TV PP A 

VjCLj I CGACoA 


pppppprprrTi 
GGCGCG 


GGATATGTCG


41700


pmp TV PPPPPP 

C 1 C ACCCjCGC 


mpp TV PPTP A m 

TGGACaC X C A X 


p a a r^r^r^r* a pp 
CAACCCjCACC 


41760


AT'TA AP'P'PPP 


A PP A rppppfiwp 
ALbAluuL X X 
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CCtAjVjA X AC AA 




p TV ppp Tv PPP A 


rnPPPPTVTVPP 


P1*PP?i AP"P , PP 

O X CvjAACLiCC 


41880


p 7a pt a pp a p^p* 
CviC 1 ACUACC 


TGTTGTACTT 


TTTGTGTCTC 


41940


TAp^pp a jppp 


rnppppPTipPT" 
XXaGGGG X CC X 


Kj\j XCKa X bCviCi 
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PP mpprnp tv p tv 
CC I vjv? X CACA 


AAA PPP 1 A PPP 

AAAGGC AC Uvj 


CUCCCCoAAA 


42060


CGGACACACA 


ACAACGGCGG 


GCCCCGTGGG 


42120 


TTCCCTTGCC 


CGCTTCCACC 


CCCCCTTCCC 


42180 


GTCGGCGGAA 


ATGCGCGAGC 


GGTTGGAGGC 


42240 


CGTGGCCGGG 


TTTTTGGCCC 


TGTACGACAG 


42300 


AGACACGGTG 


CGTGCGGCCC 


TGCCTCCGGA 


42360 


CGCTCGGTGC 


GAGGTGGGCC 


GGGTGCTCGC 


42420 


TGTGGGGCTG 


ATCGCGTGCG 


TGCAGCTGGA 


42480 


TATTTTTGAG 


CGCCGCGGAC 


CCGCGCTCTC 


42540 
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CCGGGAGGAG 
AAAACGCCGG 
CGCCATCGGG 
CGCTCCGTTT 
5 GGCCGAGCTC 
GCTGCTCTCC 
GCGGCGGCGG 
AATATGGGGG 
TGCCATGGAC 

10 TCAAGTCGCG 
CGTTTCGGCA 
CCCCGCCTTT 
GACCGCGTGC 
GTCCCCGCAC 

15 CAGTCCCCTG 
GGGTGGGCTT 
ACACGAGGTG 
GTATTACCCG 
CCAGGCTTCC 

20 GCAGGAACTG 
GGGGCCCTAC 
CGCCGAGGCC 
GGCGGTCCCC 
ACATCAGCCC 

25 CCCCGCCGCG 
CGCCAGCAGC 
ACAGATGATG 
TTCATATATT 



30 



35 



40 



TAGGTCGGAG 
GGAAGGCCCG 
GGGGTTGGTT 
TTTCTAGTCA 
TGTGTATTTT 
CCTTAGAGCT 
TTGTTGCGCT 
TTCTTTCTGG 
ATTTCTCGGG 
CCCTCCCCGC 
AGGGCCTTCA 
ACCAGGCCGG 
GAGGACACGC 
ACCGCGCGCC 



CGTCTGCTGT 

GGGGACGAGG 

CGGCGCCTTG 

CGCCACCTGG 

GCGCTGGCCG 

ACCGCCGTCA 

CAGGCCGGGA 

GCGGAGTCTG 

ACATCCCCCG 

TCGTCGTCGT 

TCGGGCGCCC 

CATTACAATC 

GGCCTGCCGG 

TACCCGCCTC 

GAGGCCCAGA 

CCGGCGGCCG 

GAGCAGCCGG 

GGCGAGGCCC 

GGGCCCCACG 

GCGCACATGC 

CACCACCCCC 

GTCTATCTGC 

CCACCCTCGT 

TCCCCCGCAC 

AGCTTACCCC 

GCGGCCCACG 

GGGTCCCGCT 

TTAAATAAAC 

AGAGGGGGGG 

GGGTGAGGGG 

GAAGACTACC 

TGGGGTTGGT 

CACGCCCCCC 

TGAACATCGG 

CGTCTTCGTC 

TGCGCAGAAC 

CCTTGTGTTC 

CCTCGGCCAA 

CCACGCCCCC 

TGGGATTGCG 

CCAGGACCAG 

CCGAGACGGC 

CCAAGTCCCC 



ACCTGATCAC 
TTCCGCCCGA 
GAACCATCGT 
ACCCGGCGAC 
GGCGCACCTG 
ACAACATGAT 
TCGCCGGACA 
CCCCTGCGCC 
CCGCGAGCGT 
CTTCTTCTTC 
CGGCCCCTCC 
AGCTCGTCAC 
CCGCGGGGAC 
CTCCCGCCCA 
TCGCCGCGCT 
CCGGAGACCA 
AGTACGACTG 
GCCCCGAGCC 
AAACCATCAC 
GCGCGCGTAC 
ACGCAGACAC 
CGCCGCCGCA 
ATCCCCCAGT 
ACGCCCACCC 
AACCCGAGGC 
TGAACGTGGA 
AACTCGCCTC 
AAACAACCGG 
GGTGGAGTGG 
GGGGGGGCTA 
ACGGGGAGGG 
TTGGGGTTGG 
CCCCCAAATA 
TGTCTTTTTA 
TCCGGCCTCG 
CATGTTGGTG 
CGTGCGCTCC 
CTTGGCCTCG 
GGGGTCGGAA 
TTGCAGTTGC 
CAGCCCCACG 
CGACACCACG 
CATCCCCTCG 



CAACTACCTG 
CCGCACCCTG 
CACCTACGAC 
GCGCGAGGGG 
GGCCCCCGGC 
GCTGCGTGAC 
CACGTACCTT 
GGAGCGCGGG 
TCCCGCGCCG 
TTCTTTTCCG 
GCCGCCCGGC 
CGGGCAATCC 
GGTGGCCTAC 
CCCGTACCCG 
GGTGGGGGCC 
CGGGATCCGG 
CGGCCGTGAC 
GCGCCCGGTC 
GGCGCTGGTG 
CCACGCCCCC 
GGAGACCCCC 
CATCGCCCCC 
TGCGGTTACC 
CCCTCCGCCG 
GCCCGGCGCG 
CACGGCCCGG 
CAGGATCCGG 
ACAAAAGTAT 
GGGGGAAAGT 
GGAGCCGAAC 
GGTGTGGAAA 
TTTTCCCGTT 
AAAACCAAGG 
TTTATACACA 
TCCTCGTTGT 
ACCTTGGAGC 
ATGGCCGACA 
TCAAACCCGC 
GTCTTGAGTT 
AGGACGTAGC 
GCAAGCGCCC 
CCCCCCACTA 
AAGAACGCGC 



CCATCGGTCT 

TTTGCGCACG 

ACCAGCCTAG 

GTGCGACGCG 

GTGGAGGCGC 

CGCTGGAGCC 

CAGGCGAGCG 

TATAAAACCG 

CAGGTCGCCG 

GCACCGGCCG 

GACGGGAGTT 

GCGCCCCACC 

GGACACCCCG 

GGTATGCTGT 

ATCGCCGCCG 

GGGTCGGCGA 

GAGCCGGACC 

GACTCCCGGC 

GGGGCGGTGA 

TACGGGCCGT 

GCCCAACCAC 

CCGGGGCCTC 

CCCGGTCCCG 

CCGCCGGGAC 

GAGGCCGGCG 

GCCGCCGATC 

ACTTGGGGGG 

ACCCACTTCG 

GGGCCGAATG 

CGATGGCCCC 

GCGACCGGTC 

AGCACATGTC 

CAAAACAATA 

AGCCCAGCTC 

GGAGCGGAGA 

TGAGCAGGGC 

CCAAAGCCAT 

CCCCCTCCGC 

CCTTGGTGGT 

GGAAGGCGAA 

CGAAGGGGTT 

CTCCCATGAC 

ACAGCCCCGC 



CGCTGTCCAC 

TGGCCCTGTG 

ACGCGGCCAT 

AGGCCGCCGA 

TCACACACAC 

TCGTGGCCGA 

AAAAATTTAA 

GCGCCCCGGG 

TCCGTGCGCG 

ATATGAACCC 

ATTTGTGGAT 

ACCCGCCGCT 

GCGCCGGCCC 

TCGCGGGCCC 

ACCGCCAGGC 

AGCGCCGCCG 

GGGACTTCCC 

GCGCCGCGCG 

CGTCCCTGCA 

ATCCGCCGGT 

CCCGCTACCC 

CTCTATCCGG 

CCCCCCCGCT 

CCACGCCTCC 

CCTTAGTTAA 

TGTTTGTGTC 

GGTGTGTGTT 

TGTGCTTGTG 

ACACAAAAAT 

CACACGCGAC 

GCAGGGAGAC 

TGCATTTGTT 

CCAGAAGTCA 

CCCTCCCCTC 

GTACCTGGCT 

GCTCGTGCCC 

ATATCGGATC 

GCCTTCCTCC 

GAGCGGATAC 

GAAGGCCGCG 

GGACATAAAG 

TACCTTGCCG 

GAACATGGCG 
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GCGTTGGCGT CGGCGCGGAT GACCGTGTCG ATGTCGGCAA AGCGCAGGTC GTGCAGCTGG 45180 

TTGCGGCGCT GGACCTCCGT GTAGTCCAGC AGGCCGCTGT CCTTGATCTC GTGGCGCGTG 45240 

TAGACCTCCA GGGGCACAAA CTCGTGGTCC TCCAGCATGG TGATGTTCAG GTCGATGAAG 45300 

GTGCTGACGG TGGTGACGTC GGCGCGACTC AGCTGGTGAG AGTACGCGTA CTCCTCGAAG 45360 

5 TACACGTAGC CCCCGCCGAA GATGAAGTAG CGCCGGTGGC CCACGGTGCA CGGCTCGAGC 45420 

GCGTCGCGGG TGAGGCGCAG CTCGTTGTTC TCGCCCAGCT GCCCCTCGAT CAGCGGGCCC 45480 

TGGTCTTCGT ACCGAAAGCT GACCAGGGGG CGGCTGTAGC ACGTCCCCGG CCGCGAGCTG 45540 

ACGCGCATCG AGTTCTGCAC GATCACGTTG TCCGGGGCGA CGGGCACGCA CGTGGAGACG 45600 

GCCATGACGT CTCCGAGCAT GCGCGCGCTC ACCCGCCGGC CGACGGTGGC GGAGGCGATG 45660 

10 GCGTTGGGGT TGAGCTTGCG GGCCTCGTTC CAGAGAGTCA GCTCGTGGTT CTGCAGCTCG 45720 

CACCACGCGA CGGCGATGCG CCCCAGCATG TCGTTCACGT GGCGCTGTAT GTGGTTATAC 45780 

GTAAACTGCA GCCGGGCGAA CTCGATCGAG GAGGTGGTCT TGATGCGCTC CACGGACGCG 45840 

TTGGCGCTGG GCGCCTCCCG CAGTGGCGCG GGCGTGGCAT TCCGGGGCTT GCGGTCCTGC 45900 

TCCCGCATGT ACTCCCGCAC GTACAGCTCG GCGAGCGTGT TGCTGAGGAG GGGCTGGTAC 45960 

15 GCGATGAGGA AGCCCCCCGT GGCCAGGTAG TACTGCGGCT GGCCCACCTT GATGTGCGTG 46020 

GCGTTGTACT TGCGCGCAAA CATGCGGTCG ATGGCCTCGC GGGCATCCCG GCCAATGCAG 46080

TCGCCCAGGT CGACGCGCGA GAGCGAGTAC TGGGTCAGGT TGGTGGTGAA GGTGGTCGAG 46140 

ATGGCGTCGG AGGAGAAGCG GAAGGAGCCG CCGTACTCGG CGCGGAGCAT CTCGTCCACC 46200 

TCCTGCCACT TGGTCATGGT GCAGACCGCC GGTCGCTTCG GCACCCAGTC CCAGGCCACG 46260 

20 GTAAACTTGG GGGTCGTCAG CAAGTTGCGG GTCGTCGGCG ACGTGGCCCG GGCCTTCGTG 46320 

GTGAGGTCGC GCGCGTAGAA GCCGTCGACC TGCTTGAAGC GGTCGGCGGC GTAGCTGGTG 46380 

TGCTCGGTGT GCGACCCCTC CCGGTAGCCG TAAAACGGGG ACATGTACAC AAAGTCGCCC 46440 

GTCGCCAACA CAAACTCATC GTACGGGTAC ACCGACCGCG CGTCCACCTC CTCGACGATG 46500 

CAGTTGACCG TCGTGCCGTA CCGATGGAAC GCCTCCACCC GCGAGGGGTT GTACTTGAGG 46560 

25 TCGGTGGTGT GCCACCCCCG GCTCGTGCGC GTGGCGACCT TCGCCGGCTT GAGCTCCATG 46620 

TCGGTCTCGT GGTCGTCCCG GTGAAACGCG GTGGTCTCCA TGTTGTTCCG CACGTACTTG 46680 

GCCGTGGAGC GGCAGACCCC CTTGGCGTTA ATCTTGTCGA TCACCTCCTC GAAGGGAACG 46740 

GGGGCGCGGT CCTCGAATAT CCCCATAAAC TGGGAGTAGC GGTGGCCGAA CCACACCTGC 46800 

GACACGGTCA CGTCTTTGTA GTACATGGTG GCCTTGAATT TGTACGGGGC GATGTTCTCC 46860 

30 TTGAAGACCA CCGCGATGCC CTCCGTGTAG TTCTGCCCCT CCGGGCGCGT CGGGCAGCGG 46920 

CGCGGCTGCT CAAACTGCAC CACCGTGGCG CCCGTCGGGG GCGGGCACAC GTAAAACTGG 46980 

GCATCGGCGT TCTCGACCTT GATTTCCCGC AGGTGCGCGC GCAGCGTGGC GTGGCCGGCG 47040 

GCGACGGTCG CGTTGGCGTC GGGGGGCGGG GTCGCCTCGG GCCGCTTGGG CGGCTTTTTG 47100 

GTTTTCCGCT TCCGGGCCTT GGTGGTCGCG GGGCTCGGGA CGGGGGGCGG CCGGGAGGCG 47160 

35 GGACCCCCGT TCGCCGCGAC GGTCGCGGCC ACGCCGCCCG AGGCGCGGGG GGCCGCCGGG 47220 

GCCGCCGGGG CCGCCGACGC CACCGCGGCC ACCAGCGCCC CCACGACCAG CGCGCAAATC 47280 

AAGCCCCCCC CGCGCATGGC GGGCCTACGG GGGCGCGTCG CTCCCGCCGC CCGCTAGTCT 47340 

GGGGGCGAGG TGCTGCAGGA CCGAGTAGAG GATGGAAAAA ACGTCTCGGT CGTAAACCAC 47400 

GACCGAGCGG GGTCCGATGC AGCCGTCGGG GCCGCTCTCG ACGATGGCCA CCAGCGGACA 47460 

40 GTCGGAGTTG TACGTGAGGT ACACGCCCGG CGGGTAGCGG TACAGACCTT CGGAGGTCGG 47520 

GCGGCTGCAG TCGGGGCGGC GCAACTCAAG CTCCCCGCAC CGGTAGACCG ACGCAAAGAG 47580 

TGTGGTGGCG ATAATGAGCT CGCGAATATA TCGCCAGGCG GCGCGCTGGG TGGGCGTGAT 47640 

TCCGGAAACA CCGTCAAAAC AGTAGAACTT TTGAAACTCG CTGACGGCCC AATCAGCGCC 47700 
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CGAACCCCCC GCGCCCATGA TGAAGCGGGC GAGTTCCTCC TTGAGGTGCG GCAGGAGCCC 47760 

CACGTTCTCG ACGCTGTAGT ACAGCGCGGT GTTGGGGGGC TGGGCGAAGC TGTGGGTGGA 47820 

GTGGTCGAAC AGGGGCCCGT TGACGAGCTC GAAGAAGCGA TGGGTGATGC TGGGGAGCAG 47880 

GGCCGGGTCC ACCTGGTGGC GCAGCAGCGA CGCTCGCATG AACCGGTGCG CGTCAAACAC 47940 

5 GCCCGGGGCG GCGCGGTTGT CGATGACCGT GCCCGCGCCC GCCGTCAGGG CGCAGAAGCG 48000 

CGCGCGCGCC GCGAAGCCGT TGGCGACCGC GGCGAAGGTC GCGGGCAGCA CCTCGCCGTG 48060 

GACGCTGACC CGCAGCATCT TCTCGAGCTC CCCGCGCTGC TCGCGCACGC AGCGCCCGAG 48120

GCTGGCCAGC GACCGCTTGG TCAGGCGGTC CGCGTACAGC CGCCGGCGCT CCGGCACGTC 48180 

CGCGGCGGCC CGCGTCGCGA TGTCGCCCCA GCTCTCCGGC CCCTGCGCCC CTGGCTCGGG 48240 

10 GCCGCGCTCC CCGTCCTCGC TCGCGGGCGT CCCCGCGCCA CGCCTCCGCC CCCCCTCCTC 48300 

CGCGGCGGCC CGGGGCTCTT CCTCCTCGGC CCCCCCGGTC GCGCCGCCGG CCCCCAGCCG 48360 

CGCCAGCACG CGGCGCAGCG CCTCCTCGTC GCACTGCTCG GGGCTGACGA GCCGCCGCAG 48420 

CAGCGGCGTC GTCAGGTGGT GGTCGTAGCA CGCGCGTATC AGCGCCTCGA TCTGATCGTC 48480 

GGGCGACGTC GCCTGGCCGC CGATGATCAG GGCGTCCACC ATGTCCAGCG CCGCCAGGTG 48540 

15 GCCCCCGAAC GCGCGATCGA AGTGCTCCGC CCGCCGCCCG AACAGCGCCA GCTCCACGGC 48600 

CACCGCGGCG GTCTCCTGCT GCAGCTCGCG CTGCGCCAGC GCGTTCAGGT TGTCGGCGAA 48660 

GGCGTCCATG GTGGAGTGGC GGGCGCGATC GCCGGACGCC AGCCAGAAGC GCAGCTCGCT 48720 

GATGGCGTAC AGGCCGGGCG TAGTGGCCTG AAACACGTCA TGCGCCTCCA GCAGGGCGTC 48780 

GGCCTCCTCG CGGACAGAAG AGCTATCGGC GGGCGGCGGG CCGGCCCGGG CCCCGCCGCC 48840 

20 CGCCGCGGTC CGCGCCAGCG CCTGGTCCAG CACACAGAGC GCTCGCGCGC GGGCGGCGTC 48900 

CGACAGCCCG GCGGCGTGGG GCAGGTACCG TCGCAGCTCG TTGGCGTCCA GCCGCACCTG 48960 

GGCCTGTTGG GTGACGTGGT TACAGATGCG GTCCGCCAGG CGGCGGGCGA TGGTCGCCCC 49020 

CTGGTTCGCG GTGACGCACA GCTCCTCGAA ACAGACCGCG CACGGGTGGG ACGGGTCGCT 49080 

CAGCTCCGGG GGCACGATGA GGCCCGACCC CACCGCCGCC ACCATAAACT CCCGGACGCG 49140 

25 CTCCAGCGCG GCCGTGGCGC CGCTCGGGGG GGTGATGAGG TGGCAGTAGT TCAGCTGCTT 49200 

GAGAAAATTC TCGACATCAT GCAGGAAGCA CAGCTCCATG CGGACGTCCC CGCCGTACGT 49260 

CTCCAGCCGG ATCTGCTGGT GGTACGGACA GGGTCGGGCC AGACCCATGG TCTCGGTGAA 49320 

AAAGGCAGAG ACGTCACCCG TGGTCGCGAA CGTTTCCAGG TGGCCCAGGA GCCGCTCCCC 49380 

CTCGCGCCAC GCGTACTCCA GGAGCAACTC CAGGGTGACC GACAGCGGGG TGAGAAAGGC 49440 

30 GGCGGCCTGA GCCTCCAGCC CCGGCCGCAG GTGCCGCCGC AGCACGCGCA CCTGGAGCGC 49500 

GTTGAGCTTT AGCTGGGCGA GCTTCCCCAG GCCGATCTGG GGGTCGCATC GTCGAAGCAG 49560 

CTCTAGCTGA AAAACGTACG TCTGTACCTG CCCGAGCAGG GCCAACAGTT TCTGTCGGGC 49620 

CGCAGTGGGC TCGGAAACCG CGGCCGGGGG CGCGGCCGCC ATGGCGAGTC GCCCGGCCGT 49680 

GCTGTGGTTT AGTTAAGGTT TGGGGGGGTG GGTCAGAGGC GCGCCCCGCG CGGACTGATG 49740 

35 CGGCGGCGGG CCCCTGACAT CCCCTCTTTA TGCCCGTCGC CCGCCCGCCC GCCCCGCCGG 49800 

TGTGCCGTGA TTCGCGGAGT CGGGGCCTTG TGTTTCTTTC TTTCCCCCCC CGAATCCGTT 49860 

CTTTCTTCCT CACCCCCCCT CCCCACACAC CCACCCAGGA CTCGCCACCA CAAGGAGGCG 49920 

AGAGCCCGTC GCTAACCCAA AGACACAGTC ACGAGACACG ATATCGACTG TAGTTGCGAT 49980 

CGTTTATTTT ATACACAACA CCAACCTTTC CTTCGACCCC CCCCACCCCC GCCCCTAGAG 50040 

40 CATATCCAAC GTCAGGTCCT TTTTCTCCGG TGGTCCCTCC CCAAACGGAT CGTCGCCGTG 50100 

AAACGCCCGC TTTCGGGCGA CGCCGGCCGC CCCCGCCGCC GCCGCCAAAC CGCCGAACGA 50160 

CGCCGCGTGG TCATCCTCGT CGCCGAAATC CCCAAAGTTA AACACCTCCC CGGCGGCGCC 50220

GAGCTGGCTG ACCAGGGCCT CCGCCTCGTG GGCCACCTCC AGGGCCGCGT CGGTCGACCA 50280 
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CTCGCCGTGC CCGCGCTCCA GGGCGCGGGT GGTAAACTCC ATCATTTCCT CGCTCAGGTA 50340 

CTCGTCCTCC AGCAGCGCCA GCCAGTCCTC GATCTGCAGC TGCTGGGTGC GGGGGCCCAG 50400 

GCTCTTGACG GTCGCCACAA ACACGGTGCT GGCGACCGCC GCCCCGCCCT CCGCAATGAT 50460 

GCCCCGGAGC TGCTCGCACA GCGAATGCTC GTGGGCCCCG CCCCCGAGAC TCGACGCCGC 50520 

5 GCACACAAAC CCGGCCCTGG GGCAGGCCAG GACAAACTTG CGGGTGCGGT CAAAGATCAG 50580 

CAGCGGGCAC GCGTTTTTGC CGCCCAGCAG GCTGGCCCAG TTCCCGGCCT GAAACACGCG 50640 

GTCGTTGCCG GCCATGCCGT AGTATTTGCT GATGCTGAGG CCCAGCACGA CCATCGGGCG 50700 

CGCGGCCATC ACGGGCCGCA GCAGGTTGCA GCTCGCGAAC ATGGACGTCC AGGCGCCGGG 50760 

GTGCGCGTCG AGGGAGTCCA TCAGCGCGCG GGCCCCGGCC TCCAGGCCCG CGCCGCCCTG 50820 

10 CGGGGCCCAG GCGGCGGCCG CCTGCACGCC GGGGGGACGG CGGGACCCGG CGATGACGGC 50880 

CGTGAGGGTG TTTATGAAGT ACGTCGAGTG GTCGCAGTAC CTCAAGATCT GGTTGGCCAT 50940 

GTAGTACATG GCCAGTTCGC TCACGTTATT GGGGGCCAGG TTGATAAAGT TAATCGCGCC 51000 

GTAGTCCAGG GAGAACCTCT TAATGAACGC GATGGTCTCT ATGTCCTCGC GCGACAAGAG 51060 

CCGGGCGGGG AGCTGGTTGC GCTGGAGGGC GGTCCAGAAC CACTGCGGGT TCGGCTGGTT 51120 

15 CGACCCCGGG GGCTTGCCGT TGGGAAAGAT GACCGCGTGG AACTGCTTCA GCAGGAAGCC 51180 

CAGCGGTCCG AGGAGGATGT CCACGCGCTT GTCGGGCTTC TGGTAGGCGC TCTGGAGGCT 51240 

GGCGACCCGC GCCTTGGCGG CCTCGGACGC GTTGGCGCTC GCGCCCGCGA ACAACACGCG 51300 

GCTCTTGACG CGCAGCTCCT TGGGAAACCC CAGGGTCACG CGGGCAACGT CGCCCTCGAA 51360 

GCTGCTCTCG GCGGGGGCCG TCTGGCCGGC CGTTAGGCTG GGGGCGCAGA TAGCCGCCCC 51420 

20 CTCCGAGAGC GCGACCGTCA GCGTCTTCGC CGACAGGAAC CCGTTGTTGA ACAGGTCCAT 51480 

GACGCGCCGC CGCAGCACCG GTTGGAATTG ATTGCGAAAG TTGCGCCCCT CGACCGACTG 51540 

CCCGGCGAAC ACCCCGTGGC ACTGGCTCAG GGCCAGGTCC TGGTACACGG CGAGGTTGGA 51600 

CCGCCGCGCG AGGAGCTGCA GCAGGGGGCA CGGCCCGCAG GTGTACGGGT CCAGCGACAG 51660 

CGACATGGCG TGGTTGGCCT CGGCCAGACC GTCGCGGAAC TTAAACCAGC TGCTTGATGT 51720 

25 TGTTCACCAC CGTGTGCAGG GCCTCGCGGG TGCCGATAAT CGTCTCCAGC CTCCCCAGGG 51780 

CCGTGGGCAC CGCCTGGTCC ACGTACTGCA GGGCCTCGAG CTCGGCCATG ACGCGCTCGG 51840 

TGGCCGCGCG GTACGTCTCC TGCATGATGG TCCGGGTGTT CTCGGACCCG TCCGCGCGCT 51900 

TCAGGGCCGA GAAGGCGGCG TAGTTCCCCA GCACGTCGCA GTCGCTGTAC GCGCTGTTCA 51960 

TCGTTCCGAA GACCCCAATG GCCCCCCGGG CGGCGCTCGC GAACTTGGGG TGGCGGGCCC 52020 

30 GCAGCCGCAT CAGCGTCGTG TGCGCGCAGG CGTGGCGGGT CTCGAAGGTA CACAGGTTGC 52080 

AGGGCACGTC GGTCTGGCCC GAGTCCGCGA CGTAGCGAAA CACGTCCATC TCCTGGCGCC 52140 

CGACGATGAC TCCGCCGTCG CAGCGCTCCA GGTAAAACAG CATCTTGGCC AGCAGGGCCG 52200 

GAGAGAACCC GCACAGCATG GCCAGGTGCT CGCCGGCGAA CTCCTGGGTT CCGCCGACGA 52260 

GGGGCGCCGT GGGGCGCCCC TCGTACCCGG GCACCACGTG GCCCTCGCGG TCCAGCTGCG 52320 

35 GGTTGGCCGC CACGTGCGTG CCGGGCACGA GAAAGAAGCG GTAAAAGGAG GGCTTGCTGT 52380 

GGTCCTTGGG GTCCGCCGGC CCGGCGTCGT CCACCTCGGT CAGGTGGAGG GCCGAGTTGG 52440 

TGCTGAACAC CATGGCGCCC ACGAGGCCCG CGGCGCGCGC CAGGTACGCC CCGACGGCGC 52500 

CGGCACGGGC CGCGGGCGTT TCCTGGCCCT CAAGCAGGGG CCACGTGGTG ATGTCGGGGG 52560 

GCGGCTCGTC AAAGACCGCC ATCGACACGA TGGACTCCAG GGCCAGGGCG GCGTCGCCCG 52620 

40 CCATCACCGA GGCCAGGCGC TGCTCAAACC CGCCCGCCGG GCCCTTGTTC CCGGCGTCGC 52680 

GCGCGCCCCG CTGGGGCTTA CCCTGGCTGG CCTCGAAGGC CGTGAACGTA ATGTCGGCGG 52740 

GGAGGGCCGC GCCCTCGTGG TTTTCGTCGA ACGCCAGGTG GGCGGCCGCG CGGGCCACGG 52800 

CGTCCACGTT CCGAGCACGC AGGGCCACGG CGGCGGGCCC GACGACCGCC TCGAACAGCA 52860 
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GGCGGGCGAG GGGGCGGTTG AAAAACGGAA 
GGTGGTTGCA GTTAAACGGA TCGGCGATGA 
GCGGATACAC GGGGATGCGG TGAACCTCCG 
CCAGGTGCAG GAAGGTGTTG CTGATGCACA 
5 GATACAGCAA GGCCCGGTCC GGGTCCAGTC 
TCTCGTGCTT TAGGTCGCAG GGCCGGGGCG 
CCCGCTCGCA GAGCCGCGTC AGGTTGGGGG 
CGTGAAAGAC GTAGACGGAC GGGCTGTAGT 
TCCCCCCAAG GCCCGTCGTG CGGGACCCGA 

10 TCTCCACGGT CAGGCCGACG ATGAGGGGCG 
CCGACAGTAG CGACAGCAGC TCCAGGCCTT 
CCATCGGCCC CGGAGGAACC TTGACGGTGG 
TCGGGAGATT GGCGACCGGC AGGAACGGGG 
GAGGCCGCGC GTGGTCGACG GCTGCTGCCC 

15 GGCGCTGGGG GTGGGGTCTA CACCCGCCCG 
TGGGTGGGAT GGGGTGGGCG AGAATGGCCC 
GGGGTTGGGC AAGGTTTGGG CGCAAGGCTC 
GGCCCAGAGC TGGGTATGCT CGGCCGGGGC 
GCGGCGTCGG GCCCCGCCCA CGGTCCGCCA 

20 CCGCCCTTCT AAAAAAAGTG AGAACGCGAA 
ATTATTAGGA CAAAGTGCGA ACGCTTCGCG 
CCCCTTTGAC GTCACGCTCA CCCGGGCGGC 
GATAAAAAGA AACCGCGGCG CCCCCGCGGA 
CGCAGAAGGG ACCCGGGCGC GGGTCCGCCG 

25 ATCCCACCCC GAGCTGTTGG GTGGGCGGGT 
GGCGGCGTAT AGCAGGACAA CGACCGGCGG 
CCCCCGGGGG GAAGTCGGCG GCTCGGGCGG 
GGGGAGCCAC CCAGACGGCA CCGCCGCCTT 
TCGCTCAGAC CGGAACGCAG CCAAAGGCCC 

30 GCGAGTGCGA CGAATTTCGA TTTATCGCCC 
AGCAGCGCAC CGGGGTCCAC GACGGCCGCC 
GGGACGAGCG CGACGTCCTC CGCGTGGGCC 
TGTGGGGCGG TGCGGACCAT GCCCCCGAGG 
TGTACGACAT CCTGGAGCAC GTGGAACACG 

35 AGCGATTTAT GGACGCCATC ACGCCCGCCG 
CCGAAGGCCA TCGCGTCGCC GTTCACGTCT 
AGGCAGAGGT GGATCGGCAC CTGCAGTGCC 
CGGCGGCCCT GCGCGAGTCG CCGGGGGCGT 
AGGCGGAGGT GGTGGAGCGC GCCGACGTGT 

40 ACCGCGTCTT CGTGCGAAGC GGGCGCGCGC 
CGATCAGGAA GTACGAGGGG GGCGTCGACG 
GGTTTGTCAC CTTCGGCTGG TACCGCCTCA 
CGCGCCCCCC GACGGCGTTC GGAACCTCGA 
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GGGGGTAGTT 


GAAATTCTCC 


CCGATCGATC 


52920 


CCCGGCTAAA 


ATCCGGCATA 


AACATCTGCA 


52980 


CGTCCCCGAT 


GGTTACCTTG 


TCCATCCCGC 


53040 


CGGCCTCCCG 


GAAGCCCTCC 


GTGATCACCA 


53100 


CGAGCCGCTC 


GCACAGCGCG 


TCCCCCGTCG 


53160 


CGTAGTCCGA 


GAAGCCAAAA 


TGGCGGCGCG 


53220 


CCTGGGTGCT 


GGGGGCCAGG 


TGGCGGCCGC 


53280 


GCGAGGGCAT 


AAGCTTGAGG 


GACACCGCGG 


53340 


CGACCGCGGC 


CACGTTGGCC 


TCAAACCCGC 


53400 


CGACGGCGAC 


GTCCGCGTCG 


CCGCTGCGCG 


53460 


CGGCCGGACA 


GGCGCGGCCA 


TACACGTACC 


53520 


TCGTCGTTTT 


GGGCTTGGTG 


TCCATGGCTT 


53580 


GCCCGGCAAG 


ACGACCGGGG 


GCAGACGGGG 


53640 


GCCGTCGTCT 


CTCCGATGGG 


GTCGAATGCC 


53700 


TTCACCGAGC 


GGCCCCTGGT 


GGGGGTGGGA 


53760 


GCCACCGGAT 


CGCGCCGGAC 


GGGGGGGCCC 


53820 


CAGCGGCGAT 


TCGAGAGGCC 


TGCGGATGGC 


53880 


GGCCGGTATA 


TGTACGGCGT 


GCTGGGAGGG 


53940 


CGCCCCGCGC 


GTCATCGGCA 


GGGGGCGTGG 


54000 


GCGTTCGCAC 


TTTGTCCTAA 


TAGTATATAT 


54060 


TTCTCACTTT 


TTTTAGAAGG 


GCGGCCACGC 


54120 


CGGCCGCCCA 


TAAGCGCGGC 


CTGCCGGGCC 


54180 


CACCACACAC 


TGGCTCTCGA 


ACCCCGGACG 


54240 


GTAAGAGCCG 


GGGGGAACAT 


CGGCACCGCC 


54300 


GGGGGGGCTG 


GTGAGGCGGT 


GGTGGGAGGG 


54360 


CGATGTTTTG 


TGCCGCGGGC 


GGCCCGACTT 


54420 


CGTCTGGGTT 


TTTTGCCCCC 


CACAACCCCC 


54480 


GCCGCCGGCA 


GAACTTCTAC 


AACCCCCACC 


54540 


CCGGGCCGGC 


TCAGCGCCAT 


ACGTACTACA 


54600 


CGCGTTCGCT 


GGAGGAGGAC 


GCCCCCGCGG


54660 


TCCGGCGCGC 


CCCTAAGGTG 


TACTGCGGGG 


54720 


CGGAGGGCTT 


CTGGCCGCGT 


CGCTTGCGCC 


54780 


GGTTCGACCC 


CACCGTCACC 


GTCTTCCACG 


54840 


CGTACAGCAT 


GCGCGCCGCC 


CAGCTCCACG 


54900 


GGACCGTCAT 


CACGCTTCTG 


GGTCTGACCC 


54960 


ACGGCACGCG 


GCAGTACTTT 


TACATGAACA 


55020 


GTGCCCCGCG 


CGATCTCTGC 


GAGCGCCTGG 


55080 


CGTTCCGCGG 


CATCTCCGCG 


GACCACTTCG 


55140 


ACTATTACGA 


AACGCGCCCG 


ACCCTGTACT 


55200 


TGGCCTACCT 


GTGCGACAAC 


TTTTGCCCCG 


55260 


CCACCACCCG 


GTTTATCCTG 


GACAACCCGG 


55320 


AGCCCGGCCG 


CGGGAACGCG 


CCGGCCCAAC 


55380 


GCGACGTCGA 


GTTTAACTGC 


ACGGCGGACA 


55440 
ACCTGGCCGT CGAGGGGGCC ATGTGTGACC
TCGAATGCAA GGCCGGGGGG GAGGACGAGC 
ACCTCGTCAT CCAGATCTCC TGTCTGCTCT 
TCCTCCTGTT TTCGCTCGGA TCCTGCGACC 
CCAGGGGCCT GCCGGCCCCC GTCGTCCTGG
CCTTCATGAC CTTCGTCAAG CAGTACGGCC 
ACTTCGACTG GCCCTTCGTC CTGACCAAGC 
GGTACGGGCG CATGAACGGC CGGGGTGTGT 
TTCAGAAGCG CAGCAAGATC AAGGTGAACG 

TCACCGACAA GGTCAAACTC TCCAGCTACA
AGGACAAGAA GAAGGATCTG AGCTACCGCG
CGCAGCGCGG GGTGATCGGC GAGTATTGTG 
TCTTCAAGTT TCTGCCGCAC CTGGAGCTTT 
TCACCCGCAC CATCTACGAC GGCCAGCAGA 

CGGGCCAGAA GGGCTTCATC CTGCCGGACA
AGGCGCCCAA GCGCCCGGCC GTGCCTCGGG 
GGGACGAGGA TAAGGACGAC GACGAGGACG 
TCGCGCGCGA GACCGGGGGC CGGCACGTTG 
CCTCCGGGTT TCACGTCGAC CCCGTGGTGG 

TCATCCAGGC CCACAACCTG TGCTTCAGTA
ACCTGGAGGC GGACCGGGAC TACCTGGAGA 
TGAAGGCCCA CGTACGCGAG AGCCTGCTGA 
GAAAGCAGAT CCGCTCGCGG ATCCCCCAGA 
AGCAACAGGC CGCCATCAAG GTGGTGTGCA 

ACGGTCTTCT GCCCTGCCTG CACGTGGCCG
TCCTCGCGAC GCGCGCGTAC GTGCACGCGC 
ACTTTCCGGA GGCGGCCGGC ATGCGCGCCC 
GGGACACGGA CTCCATTTTC GTTTTGTGCC 
TGGGCGACAA GATGGCGAGC CACATCTCGC 

AGTGCGAAAA AACGTTCACC AAGCTGCTGC
TCTGCGGGGG CAAGATGCTC ATCAAGGGCG 
TTATCAACCG CACCTCCAGG GCCCTGGTCG 
GAGCGGCCGC CGCGTTAGCC GAGCGCCCCG 
AGGGACTGCA GGCGTTCGGG GCCGTCCTCG 

AGAGGGACAT CCAGGACTTT GTCCTCACCG
CCAACAAGCG CCTGGCCCAC CTGACGGTGT 
TCCCGTCCAT CAAGGACCGG ATCCCGTACG 
AGACGGTCGC GCGGCTGGCC GCCCTCCGCG 
CCGCCCCCCC AGCGGCCCTG CCCTCCCCGG 

CCGACCCCCC GGGAGGCGCG TCCAAGCCCC
ATCCCGGGTA CGCCATCGCC CGGGGCGTTC 
TGCTGGGGGC GGCCTGCGTG ACGTTCAAGG 
AGAGTCTGTT AAAGAGGTTT ATTCCCGAGA 
TGCCGGCCTA


CAAGCTCATG 


TGCTTCGATA 


55500


TGGCCTTTCC 


GGTCGCGGAA 


CGCCCGGAAG 




ACGACCTGTC 


CACCACCGCC 


CTCGAGCACA 


55620


TCCCCGAGTC 


CCACCTCAGC


GATCTCGCCT 


55680


AGTTTGACAG 


CGAATTCGAG 


ATGCTGCTGG 


55740


CCGAGTTCGT 


GACCGGGTAC 


AACATCATCA 




TGACGGAGAT 


CTACAAGGTC 


CCGCTCGACG 


55860 


TCCGCGTGTG 


GGACATCGGC 


CAGAGCCACT 


55920 


GGATGGTGAA 


CATCGACATG 


TACGGCATCA 


55980 


AGCTGAACGC 


CGTCGCCGAG 


GCCGTCTTGA 


56040 


ACATCCCCGC 


CTACTACGCC 


TCCGGGCCCG 


56100 


TGCAGGACTC 


GCTGCTGGTC 


GGGCAGCTGT 


56160 


CCGCCGTCGC 


GCGCCTGGCG 


GGCATCAACA 


56220 


TCCGCGTCTT 


CACGTGCCTC 


CTGCGCCTTG 


56280 


CCCAGGGGCG 


GTTTCGGGGC 


CTCGACAAGG 


56340 


GGGAAGGGGA 


GCGGCCGGGG 


GACGGGAACG 


56400 


GGGACGAGGA 


CGGGGACGAG 


CGCGAGGAGG 


56460 


GGTACCAGGG


GGCCCGGGTC 


CTCGACCCCA 


56520 


TGTTTGACTT 


TGCCAGCCTG 


TACCCCAGCA 


56580 


CGCTCTCCCT 


GCGGCCCGAG 


GCCGTCGCGC 


56640 


TCGAGGTGGG 


GGGCCGACGG 


CTGTTCTTCG 


56700 


GCATCCTGCT 


GCGCGACTGG 


CTGGCCATGC 


56760 


GCACCCCCGA 


GGAGGCCGTC 


CTCCTCGACA 


56820 


ACTCGGTGTA 


CGGGTTCACC 


GGGGTGCAGC 


56880 


CCACCGTGAC 


GACCATCGGC 


CGCGAGATGC 


56940 


GCTGGGCGGA 


GTTCGATCAG 


CTGCTGGCCG 


57000 


CCGGTCCGTA 


CTCCATGCGC 


ATCATCTACG 


57060 


GCGGCCTCAC 


GGCCGCGGGC 


CTGGTGGCCA 


57120 


GCGCGCTGTT 


CCTCCCCCCG 


ATCAAGCTCG 


57180 


TCATCGCCAA 


GAAAAAGTAC 


ATCGGCGTCA 


57240 


TGGATCTGGT 


GCGCAAAAAC 


AACTGCGCGT 


57300 


ACCTGCTGTT 


TTACGACGAT


ACCGTATCCG 




CAGAGGAGTG 


GCTGGCGCGA 


CCCCTGCCCG 


57420


TAGACGCCCA 


TCGGCGCATC 


ACCGACCCGG 


57480


CCGAACTGAG 


CAGACACCCG


CGCGCGTACA 


57540


ATTACAAGCT 


CATGGCCCGC 


CGCGCGCAGG 


57600 


TGATCGTGGC 


CCAGACCCGC 


GAGGTAGAGG 


57660 


AGCTAGACGC 


CGCCGCCCCA 


GGGGACGAGC 


57720 


CCAAGCGCCC 


CCGGGAGACG 


CCGTCGCATG 


57780 


GCAAGCTGCT 


GGTGTCCGAG 


CTGGCGGAGG 


57840 


CGCTCAACAC 


GGACTATTAC 


TTCTCGCACC 


57900 


CCCTGTTTGG 


AAATAACGCC 


AAGATCACCG 


57960 


CGTGGCACCC 


CCCGGACGAC 


GTGGCCGCGC 


58020 


GGCTCAGGGC CGCGGGGTTC GGGCCGGCGG GGGCCGGCGC TACGGCGGAG GAAACTCGTC 58080

GAATGTTGCA TAGAGCCTTT GATACTCTAG CATGAGCCCC CCGTCGAAGC TGATGTCCCG 58140 

CATCTTGCAA TAAATGTCTG CGGCCGACAC GGTCGGAATT TCCGCGTCCG CTGGTTTCTC 58200 

TGCGTTGCGT CTGACCACGA GCACAAACGT GCTCTGCCAC ACGTGGGCGG CGAACCGGTA 58260 

GCCGGGGCAC GCGGTCAGCA TCCGATCGAT GAGCCGGTAG TGCAGGTGGG CCGACGTGCC 58320

GGGGAAGATG ACGTACAGCA TGTGGCCCCC GTACGTGGGG TCCGGGTAAA AAAGAAACCG 58380 

GGGGTCGCAC GCCCCCCCTC CGCGCAGGAT CGTGTGCACG AAAAAGAGCT CGGGCTGGCC 58440 

GAGCGTATCG GCCAGGAGGT CCTGGAGGGG GGTGCTGTGG CGGTCGGCCA GCACGACCAG 58500 

GGAGGCCAGA AAGGTGCGGT GCTCAAAGAT CGTATTGATC TGCTGCACGA AGGCCAGGAT 58560 

GAGGGCCTCG CGGCTGACGG TGGCCAGCCG CCCGTCGCCC GCGCTGCACG CGGGGCAGCA 58620

GCCCCCGATC CCCAGGTAGT AGCCCATGCC CGAGAGGGTC AGGCAGTTGT CGGCCACGGT 58680 

CTGGTCCAGG CTGAAGGGGA GCGACACGGG GGTCGTCTTC ACCAGGGGCA CGGATAGCGA 58740 

GCGCACGATG GCGATCTCCT CGGAGGGCGT CTGGGCGAGG GCGGCGAAGA AGCCGCGGTA 58800 

GCGACGGCGC TCGTGCAGGC AGAGCTCCAG CCTGCGCGCG TGCGACGGCA GGCTCTTGCG 58860 

GGAGGCCCGG CGCTCCACGC CGGGGTTCCC GGCGGCGGAA AAGCGCGACC GCCGCCGGGT 58920

CTTGTCGCGG CCGGGCCCGG GCCGGGAGCC GGAGCGACGG GGGGCGATGT CATACATAGG 58980 

TACAGAGGGT GTGCTCCAGG GACAGGAGAG AGATCGAGTG TCGTCTGAGC AGCGCGCCGG 59040 

CCTCGCGGAC AAATGTGGCC AGCGCGGTGG GCTTCGGCAC AAATACCTGG TACGTCTTGA 59100 

AGGTGTAGAT GAGGGCCCGC AGGGCTATAC AGACCCGCCC CTCGAACTCG TTGCCGCAGG 59160 

CCAACTTGGC CTTGTGAAGC TGCAGCTCGT CGCGATGGTC GGCGCGGGGG TGGCCAAACA 59220

GGACCCAGGG GTCGACTTCC ATCTCCGTGA TGGCGCACAT GGGATCGCAG AACATGTGCT 59280 

TGAAGATGGC CTCGGGGCCC GCGGCCCGAA GCAGGCTCAC GAACCGGCCC CCGTCCCCGG 59340 

GCTGCGCCTC GGGGTCCGCC TCGAGCTGGT CCACGACCGG CACTATGCAG TCGAAGAGGC 59400 

TGGTGTTGTT CTCCGAGTAG CGGACGACGG ACGCCCTCAG GCGTCGCATG GCCAGCCAGT 59460 

AGGCCCGCAC CAGCAACAGA TTGCACAGCA GGCATTCCCC GCCGGTGCGC CCGCGGCCCC 59520

GGCCGTGCTT CAGCACGGTG GCCATCAGCG GGCCCAGGTC CAGGTCGGGC TGGGCCTTGG 59580 

GCTCGGCGAA CTGCGCAAAG CGCGGGGCCG CGTCGCGCAT GCGCGCCCCG CGGTGCGCTT 59640 

CCCAGGACTC GCTGACCGCG GCGCGGCGGG CGTCCGCGGC GGCGCGCAGC CGGGGCCCCG 59700 

ACTCCCAGAC GGCGGGGGTG CCGGCGAGCA GCAGCAGGAT CAGGTCGGCG TACGCCCACG 59760 

TCTCCGGCTC ACCCCCCTGC GCCAGCGCCC CGGCGGCGGC CTCGAACTCC CCGTTGCGGG 59820

CGGCGGCGCG CGTGCAGCAG CTGTCTCCGC CCCCGCGCTT GCCCTCGGTG CAGTCGAGCA 59880

GGCGGGCGCA GTCCTTCCAG TTCATCAGGG CGGTGGTGAG GGAGGGTTGC GTTCCCGAGC 59940 

CCCCGCCCGC CCCCGCCCCG TCATCGCCCC CGGAGGCCAG GGTCCCGATG AGGGCCCGGG 60000 

TTGCGGACTG CGCGAGGAAG GAATAGTTGG AGTACTGCAC CTTGGCGGCG CCCGGGGAGG 60060 

GCGTCGGCCT GGGTTGCTTC TGGGCGTGGC GCCCGGGCAC CCCGCCGTCG GTCCGGAAGC 60120

AGCAGTGGAG AAAGAAATGC CGGTGGATGT CGTTGATGGT CAGGGCGAAG CGCGCGAAGG 60180 

AGCCGACAAG GGTCGCCTTC TTGGTGCGCA GGAAGTGGTG GTCCATGACG TAGACGAACT 60240 

CGAAGGCGGC CACGAAGATG CTCGCGGCGC AGTGGGGCGC GCCCAGGCAC TTGGCGCAGA 60300 

GGAACGCGTA ATCGGCCACC CACTGGGGCG AGAGGCGGTA GGCCTGCTTG TACAGCTCGA 60360 

TGGTGCGGCA GACCAGACAG GGGCGGTCCA GCGCGAAGGT GTCGACGGAC GCCGCGGCGA 60420

AGGGCCCCGT GTCCAAGAGT CCCTCTGCCG TGGGGTCTGC GGGCGGGCCG CGGGCGGACC 60480 

CCGGCCCCCG CCCCCCCGAA GCCTCGCGCG CGGCCCCGCG CGGCCGCGGG GGGGCGGGCG 60540 

CGACGTCGCT CTCCACGTCC TCGTCGAGCG CGCTCGCGGG CGGCACGCCT ACCACGTGAC 60600 

AGGCCGCCAG GAGCTCGGCG CACAGGGCCT
CCACATACGG ACGCTCGAAC GCGCCCTCCT 
CGGCGGCGCT CGACGGCACC CCCGGGGCGG 
CGCGTCCGCG AACGTTACGG GACGCGATCC 
AAAGTCTAGA CGCGCGCTAC GTCTCGCGAG
AGGACATGAC CCCCGCCGAA CTAGAGGTTA 
ACCTCTCGCG GACGCAGCGG CTGGCCTCCC 
CCGACGGCCC CGCCGCCCCA CATACGCAGG 
CCCGAAAGCG CGAACGGTTC GCGGCGGTCA 

TGCGGGGCTG ACGCGCGCTT CGGCGGGGCA
CAGTAGGGGG TGGGGGAACG CGCACCCTTG 
CTACGGCGGC CGCCCGGGGG ACGCGTTCGA 
TCCCACCACG CTGCGCGGCG GGGGTGGGGA 
CTCGAGATGT GCCTTCCAGT TCCACGGCCA 

GTACGTCCTG CGGCTCATGA ACGACTGGGC
GCAGAACACC GGCGTTTCGG TGCTGTTTCA 
GGGGGGCGCG ATCACGGCGG AGCAGACCAA 
ACTGTCCCTC GGAGACCTGG ACGACGTCAA 
GATGGCCAGC ATGTGGATCA GCTGCTTTGT 

GTTCATGGGC CCCGAGGACG CCGTTCGCAC
GGCCCTCGCC CGTCGCCGCC GGTCCAGGCG 
GGCGGCGGCG CACCACTCTT CCGGAGCGCC 
AGCGCCGCCC GGACGGGGAC CGGCCCGTCC 
CCCGCGTCCG GGCCCCCCGG CGCTTCTGTT 

TATCTGGTGG GCGGTTGGCG CGCGCCTATG
GTGCATCCCA GACGCCCGCG AGCCGCACAT 
TACGGCGCGA CCCAAGGTCC CGATGGCCGC 
CGCCGACAAC GTCCGGGCGC TCGGCATGCG 
GTTCATCATG GATAACAGCT ACCCGCATCC 

TCTTCGCGGG CAGGCCGCGG CGCTGACGGA
CGCCCCGCAG CCTATGTTCG CGGGCGACGC 
TCTTAAGCGC ACGTATTCCC CCTTTGTCGT 
AGTCCTCGGC GGGTCCCTCC GCGGCCGTCT 
TGGTTCAATA AAAAACACCA ACATACGATA 

TAGGGCCCAA CGATCGGCGA TTAACAACAC
ATGCGCACGT GATGTAGGCT GGTCAGCACG 
GTCCGCTGCA GCTGTTGTTG TATGCGGCGG 
CGACCGGTGC TTCGTACGTA GCGTCGCGAC 
CCAAATTGCG AGTGTGGTGA CTGGAGGTGG 

GCGGGGGGCA AGTGCGGTTC CGGTGGGAGG
GAGAAACGCA GGGAGTCTGC GTCGGAGTGT 
AGCAGCGATG CGGGTGGGGG CGCGGAGTCG 
GTCACAGCGG ACACTGGGAG GTGGGTGTTT 
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CG 1 T. AAGAGC 


papa arriTrr 

CAGAAVjo lib 


ClT* A mpp A A PIP* 
CGGGTGCAGC
61740


GCGGCGGATC 


CTGTGTCGCG 
61800
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COO Q A 
oZZ oU 


GCACGGAACG 


CAGGGTGCGG 
62700


GCGTTGCTGC 


GCTGAAACAG 


CGCCCTGCGG 


62760


CATGCGCGGA 


TCAAAACCGC 


CAGGGCGCTA 


62820 


AAGACGGCAT 


TTGCCTGTAC 


GGGCAAGGGG 


62880 


TCGGCGGCCA 


ATGGGCCGGG 


TGGTTCGTCG 


62940 


GGGTCGAGCG 


CCTCGGTATC 


ATCCGAGTCC 


63000 


TCATCATCGG 


AGGAGATGTG 


CAGCGTCTGA 


63060 


ACGTGAAGCG 


CGAGAGAGGA 


AGCCCACGAA 


63120 


GTATGTGTGG 


GAGACTCGGG 


CGTCGGGACC 


63180 


GAGTCTCGGC
GGCAGGGGCG 
GGCAGGGGCG 
GGCAGGGGCG 
GCGGGTGACT
AAAGGCTCGC 
GAGGTCGGGG 
GCCGCGTGGT 
GCGACCGCGG 

GGCGAGGGAC
GCTCCTTCGG 
GCCGAGGCGG 
GGGGTGGGCG 
AGTGCCGGCG 

CGCGTCCGGC
GAAAAGGCCG 
CGTGGGTCGT 
GGTCGTGTTT 
GGGACGGACT 

CCGCCGGCGT
GGGACGGGCT 
GCCTGCCGCG 
AGGTCCCGCG 
CGCTGGTAAA 

CGCCCGAGGA
TCCCCCGCGG 
TCAGGGGGGC 
TCCACGAGGA 
AGGCGCGGGA 

AGGCCGACGG
CCGCCGGGGC 
GCCGTGGTGT 
AGGTCCATGG 
ACGTGCTGGC 

CCCCCCGCCG
CTGTCGTACT 
CTCGGCAGAC 
TTCCCGAGCG 
GCCAGCGGGT 

AGGAGGCCCC
TCGGGATGCG 
CGTTCGAGGG 
TAGATGTGTT 
TCTGGGGCAG GGGCGGCTGG GGCAGGGGCG
GCTGGGGCAG GGGCGGCTGG GGCAGGGGCG 
GCTGGGGCAG GGGCGGCTGG GGCAGGGGCG 
GCTGGGGCAC CGAGCGCGCG CGGATGCGCG 
GGGGTGGGGG GCGGCGGGCA ACCGGGCCTC 
TCGGGGCAAC CGGGCCTGGG GCCAAAGGCG 
GGGCAAGGCC CGGAGAAGGC GGCACTGCCG 
CGGCTGGGTC CCGGGGAGAG GGGAGGGAGT 
GGCGCGTGAG GCGCCGGGGT GGGCCGGCCG 
CCGCTGTTGT CTGGCGGCGG CCGCGGCGGC 
CGGGCGGAGG CGGGATGGGC GCGAGCGTGG 
GGCCGGGCGG AAGGGGCAAA GCAGAAACCC 
GCTGGTCGGG AGGACGCGCG GAAGCGGCGA 
GACGCCACCC CTCGGGGGGG GCGGAGGCCC 
GGGATCTGCG CACGCGCGGC ACGGCGGCGG 
GGGGAGGAAG CGCGGCATCC GCGGGGGGAC 
CGCGAGGGGC CACGGGCACG CGCCCCGTGT 
CGCGAGCCGT AGCTGCCGGC CCGATGGGCC 
GATCGGTGGC GGGGGGGGGA AGAAGGGCCG 
CGTCGGACGC CAGCTCCTCC AGGCCGTGGA 
CGCCGGTGGT GGCGTCGGTG AGGAGAGTGG 
CGGGGGGGGC AGCGGGGTCC TCGGGACCCG 
GGTCGCGGGC GGCGGTCGGG GCAGAGGGAC 
CCGGGTGTCC CGGGAACAGC TCCCCCGTCA
TGGCCCGCGC GAAGAAGGGG TCCGCGTCGG 
TAGCCACAAA CGGAAGCTCC TCGGTGGCCT 
CGGGGGGCTC CGGGGCTTCC CACAAGACCG 
CCAGGCACGG GGGCCCGTCG GCGAGAGGGC 
GCTGCGCCGC CAGACACGCG TTTTCGATCG 
CCCACGTCTC GATGTCGGAC GACACGACGT 
GCGAGTCGAA GAGCGTCAGG CACAGTTCCA 
TGCGGAGCGC CACCACGACG GGCGCGCCGA 
CCGTAACGCG CGCGGCGGGG GTGCGGTGGG 
CCGTGGGTCG GTAGAGGGCG TGGGGGGCCT 
GGCCGAGCGT CTGGCCAGAC TCCAGGCGTG 
CGGTGTAGTC GTCGGGAAAC ATGCAGGTCC 
ACATGCGCCC GAGGACGCTC ACCGCCGCCA 
CCGGGGCGTC CCGGCGCTGG GTCCCGAGCT 
CGGTTTCGGA CAGCTTGCCC CGGCGCCAGT 
GGGTCGGGGG GCCTCCGTCC AAAAACGTCG 
GGGTCAGGCG CTGGACGAAC AGCATGGACT 
TGAGGTGCAT GTACTCGTGC TGGCGAACGA 
CCGGAACGCC GGCCACCAGC GCGACCAGCA 




GCTGGGGCAG 


GGGCGGCTGG 


63240


GCTGGGGCAG 


GGGCGGCTGG 


63300


GCTGGGGCAG 


GGGCGGCTGG 




TCCGCGCGGC 


GGGTTTGGTC 


63420


CGGGCACGAC 


CCAACCGCAC




GGGGGCTGGT 


CTGGACGGCG 


63540 


CCGCTGCGGC 


GGAAACCGCG 


63600


TCAACGAGGC 


CGAGAGCGAG 


63660 


CGGGGC CLLG 


GGGGGGTGTC 


63720 


GGTCGTCCCC


GGGGGCGACC 


63780 


GGGCGGGAAA 


GGCCCCGCGA 


63840 


AAGCCGGGGG 


CGCGGACTCC 


63900 


CCGGGGCGAC 


CGGGGCGGGG 


63960 


GGGGCGCGCG 


CGATTTGGCA 


64020 


AGAAAGCGGC 


GGCAGAGCCG 


64080 


TCGGTGTGGG 


TGGCGAGGGC 


64140 


TTTGTTGAGG 


CGGGACACTC 


64200 


GCGGTGCGTA 


CTGGGACGTG 


64260 


GGGCCGGATT 


GGGCGTGGGG 


64320 


TCCAGGCCCA 


CATGCGAGGG 


64380 


GGGCGAGGAC


CCCCGGGTCC 


64440 


ATCCGCCATC 


CCCCCCCGCA 


64500 


CTGCCTCGTC 


GGCGAGGGGG


64560 


GGAGGGAGGC 


GTCGAAGGGC 


64620 


CGGCGCTCGC 


CGCGAGAACG 


64680 


CGCTGCCCAC 


AAACCGCACG 


64740 


CGACCGGGGT 


CATGGAGATG 


64800 


GCTCGGCGAT 


GAGCGCCGAC 


64860 


GGTTGAGATC 


GGTGTGGAGG 


64920 


CGCGCAGGGC 


GGCGTCCGGC 


64980 


Gl 1CCGACTC 


GCGGGAGAAG 


65040 


GGAGCACCGC 


GGCCAGAACC 


65100


TCGCGGCGGC 


CAGCACGGCC 


65160


CGGGGAGGGA 


CGCCTCGCGC 


65220


CGGCCAGGAG


GGCGTCGAAG 


65280


ACAGCGCGGC 


CAAAGCGGCG 


65340 


GGGCCTGGGC 


CGGACTGAGC 


65400 


CCAAGGCCGA 


GCGCCAGGGC 


65460 


CGGCCAGCCG 


CGTGCCGAAC 


65520 


GCAACACGCG 


GATGCGGGCG 


65580 


CCGCTGCGTC 


CTCGAACGCG 


65640 


GGTCCAGGCG 


CCAGAAGTTG 


65700 


CGTCGTTCTC 


GTTGAAGGCG 


65760 
ACGCAGTGGC GCTGGGACCC
GCCCAGCCCA GCTGGGCCCA 
GCGACGTACA GCTCGGCCGC 
ACGAAGCGAC CGAACAGCTG 
CGGTTCACGA CGGTCAGCAC
GTCGGAAGCG GGGGCCGCAC 
TCGGTGACGG CCGGGAAGCA 
CTGGTGGCGA ACGGCAGATC 
TCTCCGCCGG CGCGCAGATA 

TGGCCCTCGG GGGAAGAAGA
AGGGGCGCGG GCGCGGCCGC 
CTATATAAGC CCATGCGGCG 
CACGGGGCCA GGTCCGCGGC 
AAGTTCATCT CCCAGGGCAC 

GTGTGCTTGG TGACGCGCGC
GCGCCCAGGA CGCCCTGGTA 
CCCGACACCG TGTTGGTGGT 
CCCTCGGGGG ACTTCCAGGC 
CACTCCGCCT CGGCCTCCTC 

AAGAGCGCCC CCAGGCGGCC
GCGCTTAGCG GGTGCGTCTC 
ACGTCGAGCT CGCGCGTTTT 
TCCGCCACCG AGCGCGCCTG 
TTCAGCATGG TCTTGAGGTT 

GCGGCCAGCG CCTCCCGCAG
TTGGCGCGCA CCACTGCGTC 
TACTCGGCGT ACGCCGTGTT 
CACGCGACCA GCGCGTCCTC 
GCGGTGGCCT CCGGGTCATT 

AGGGGCACGC CGCCGAGCGC
AGCTCCACGT AGTCGGCGTA 
ACGCTCGTCA TGTCGTCCGC 
CCCGCCTGGG CCATTTCCAG 
GCGTCGACCC GAAACATGTC 

CCGAGGCGGT GGATGGCGGC
TAGGGGTTAA ACGCGAAGGC 
AGCGCCTGCT CGGCGCGCTT 
GCCTGCAGGC GGCGCAGCTC 
GCCGCCTCGA CGCCGGCGGC 

TTCGCCGTCA GGTCGGCGAC
ATGACCTTGC CCAGCTCCTG 
TCGGCGTGCA GCAGGCCCCC 
GTCGTCGCGC GGGCCGCGGC 



CCGGGGGCCC GGCGGCGGAC 
GCGACACCCA AACTCGCGCG 
CGCGTCGATC GAGGCGCCCC 
AAAGTTGGCG GCCTGGGCGT 
GTACATGGCC GTGACCGTCG 
GCAGGCCGCC TCGGGACGCA 
TAGCGCGTAC TGCAGCGGCG 
CAGAGCGCTG ACGGCCTCAC 
CGCCTCGCCC CGGCGGCGCA 
GGCCCGGGCG CGGGCGTCGA 
CGCGCCCGCG CCCGTCTGGC 
TTGGATGAGT TCCCGCGCGC 
CGCCGCGTCG AACTCCGCCA 
CCTGCGCACC ACCTCATCCC 
GCCCAGCTCC TCCACGGCCT 
CCTGGCGGAA AGGCGCTCGT 
GTCCTGCAGG GCGCGCAGCT 
GCCCCCCCGG ACGCGGCCAA 
CAGGGACCTC CGCAGGGCGT 
GGCGTGCCGC GCCAGGGGGC 
GAAGGTGCGC TGGGCGTGCT 
CTCGGTCTGA TCCAACAGAA 
GTCGAGCGTC TTGGCCACGG 
GGCCAGGCCC TCGGCCTCGA 
GCCCGCCATG ACCCGCTCGG 
CTTGGTCTCG GCCGTGTCCT 
CTTCACGGGG CTCTGGTCCA 
GCTGGGACAC GGCAGGGTGA 
CCGGGCCGCG GATATCTGCT 
CCGGTGCACG TCGGCCCGGA 
GCCATGTTGG AAGAACGGCA 
CAGGCGCCCC ACGGCCTCGT 
GAGCCCCTCC GCGATGCGCA 
GGCGTAGGTT TCGGCGGCGG 
GAGCGGGGGG AGCATGGGGT 
CGTATCCAGG GCGAGGGTGA 
GCGGAAGTCC CGGGGGTTGT 
GACCACGTCG AACTCGGCGC 
CCAGCGCTCG CTGCTGCCCC 
GGCGGCCTCA AGTTCGTCGG 
CAGGGCGCGC CCGCTGGGGG 
GAACCCAGCC TCGTGCCCCG 
ATCGATGAGG GCGGCATGGT 
GCCCGGCGCG CCTGGACTAC CAGGTCGGCG
ATGGCCCCCC GCGCCTCCAG GGCCAGCCGA 
GCCAGCGCCG CGAGGAAGGA CAGGGGCGAG 
GACACCGCGT CCGCCAGGGC GCCATGCGCC 
CTTGCCGTCG CGACGGCGGC GCTCCCGGCG
GTGGGGGCGT GATCGGAAAA GAACTGCACG 
AGGGTCTTCA GCACCACCAC GAAGGCGGGA 
GGGGTCGGGT GTTCCAGGGC CTCCCGGTAC 
AGCGCCGCCG TGACTTCCGG GGGGGGGCCC 

GCGGGCAGGG AGGCCCGCAG GGTCGCCAGC
TCCGGGAGGG GCCGCAGGAC CCCTTGGAGT 
GCCACCTTGG CGCGCTCCCG CGCGTCGTTG 
CGAAGCCGGG AGCGCGCCTC CGGAGCGAGC 
CTCGCCTGCC GCAGCGCGTC TTCGGCCATG 

TCGACGTACG GCGCGGGGCC GGTCGCCGGG
GCGAGCGCCG CGTCGAGGGC GTCGAAGCGC 
GCCTGCTGGT CGTTGATGCC GTGGATGCTG 
ATGAGCCCCT GGGTCGCGGC GTCGGTCAGG 
GCATCTAGGG TCTGGCCCCG CTGGAGCAGG 

GCGAGGGGGG GCGGGGGGGG GAGCGCGGCG
AAGGCCGGTA GCGATTCCAG CAACTGGACC 
AACCGACAGT CGTGGCTGTC GCTGGCCTGC 
TGGAAGTACT CCTTGATCGC GCTCTCGATC 
AGCCGCGCCT GGATGGCCTC GGGGCCCAGG 

CCCGGGGCGG CGGGCACGGG CATCACGGTC
ACCCCGCGGG CGAGGGCGTC TAAGGCCTCG 
ATCTCTTCGC CCCGGGCAAA CTGGGCCAGC 
TGGGTCGGGG TGGCGGGGGC GAACAGGGTG 
CACTCTCCGA GGCGTGCGTA CAGATTGGCC 

TCCGCGAGGT CCCCGTAAAA GGCGTCCGTC
AGCTTAGCGA GGGCCAGGCG CCCGATCTGC 
GGCCGGTGGG CGGCCACGTC CGCCAGGCTC 
GCCGTTTTGC GGGGCAGCAT GCGCAGGGTG 
ACCCCGGCCT GCGTATGCGT GCGGGCCCCG 

AAGAAGAAGA TGACGCAGAG CTCCAACAGC
GCGTTGATGG TGAGCTGCGA ACACGCGGCC 
GCGAGCCGGA CCGCCGTGGC GGCCACATTG 
GCGCCGGGGG GCTCCGGGGG GCGGCGGGCC 
GGGCTCGCGG GCCCGTCATC GCCGCCTCCC 

GGAGGGACCG TGGCGGCTAT GGGCGTCGGG
GCCTTCTTCT TGGGCGCGGA CTTCTTCTTG 
CTCTCGCCCG AGGTCAGATC CTCCACGCTG 
TTGGGCAAGC CGGTAGAATA GCGCGCCCGG 
AGGACCCGCA GGTCCTCGGC TTCTTCGGCC
GCGGCGTGCG GTGGACCCGA GGCCGCGGCG 
CCCTCCAGGG CTGCTGCCCA CACATCATCG 
GTGTTGGGTG GGCCCGAGGC CCCCCGGGGG 
TGGACGTGGG TGGGCGCGGG GAGCGCGGGG
CTGGGGACCA CACCGACAAA GAGCGCCCCT 
GTGATGGCCA CGCGCCGCTC GACGAACGGT 
TAGAGGTGCA GGGCCGCGGC GGTCAGGTCC 
ACAAAAAACA CCATGGCGCC CGCCCACCGC 

AGGTACGGGT ACACGTCGCC CGCCCGCACC
AGGCCGTGCG GGTCAAACAG ATAGGCCGTG 
GGGCCGATGG TCAGGAGCGT GTAGGACAGC 
GTGTGCGCGG GGCATTGCGT CTCCAGCAGC 
TCGCCGTACA CCCGCGAAAA CACGCAACGC 

AAGTTGGGGA GCTCGATAAT GGAACACATG
GTCCACTCGC CCCCCTCCAC CAGACATCCC 
GGCCCCACGT CGAAAAGAAG ACTGAGAAAC 
CCCCCCGGCT CCAGATCGGT CGCGAACTGG 
TCCCCCTGGC GCTTCATCGT GGGGTGAGGT 

GCCACGAGCG GGGCCTGTTT ATGGGCCGGG
CCGAACCCCC CCCGCCCATC AACCGCCTGT 
GTGTGTGGTT TCCCGGGAAG CCACATCCCA 
CCGCACTACG CCACCTTTCC ACCCCCCCCC 
CAGATGGATG GGTGCGATAA TAAAGCTTTA

GTGTACCGGT GGTGTCTCCT GCGGCGTCAT
AGGGACCGTC TCGCGGCCCG CCGGGCGCGT 
GCCGGGTGTC GTGGGTTCGG GGGTGCTACC 
GGCCTCCGGG CCGGCGGAAG GCCGAAACGC 
CAGGAGCTCG TTTATTAATA GCCAGTCCAT 

CAGGTCCACG GAGTCCGGAA CCACCGTCGG
CCAGGCCCCC AGGTCATGAC GGTTCGTGAG 
CGCGTCCTCG GTCGCGTGGG CCATCACCTC 
GCTGGCGAAG GGCGCCACGA CCAGCGCGCG 
GAGTTCCTGA ACGAACTCGG CCACCCGCTC 

GCCGGCCGAG AGGCGCCGCC AGCGCGCCAG
CTGGTCCCCC GACATCAACT TTGACGCCCT 
CCCGTGGATT TCCCGCCGCA CGACGGCCAG 
CAGCTCGTCG CATACCCCGA GGTGCGCCGT 
GAGGGACGCG ACCAGCGCGC GCTTGGCGTC 

GTCGCCCATG GCCTCGGGGC GCCAGGGCCC
GTACAGGCGG TGCCCGTCGC TCTCGAACCG 
GTGCAGCCGC AGCAGCACGA TCGCGTCCTC 
GGCGAGCTCC TGCAGCACCC CCCGGGCCGC 
GCTGCCCACC TCGGGAGGCT GGGGGGGAGG CAGCTGGACC GCGGGCCGCA GCTGCTCGAC 73560

GGCCCCCCTG GCGATCACGT ACAGCTCGCG CAGCAGCTGC TCGATGTTGT CGGCCATCTG 73620 

CATCGTGGGC CCGACGCCGG CCCGGGTGGC CGGTTCGAGG AGGGTGATCA GCGCGCCCAA 73680 

TTTTGTGCGG TGCCCCTCGA CGGTGGGGAG ATAGCCCAGG CCGAAGTCGC GCGCCCAGGC 73740 

CAGCACCCGC AGGGCAAACT CGATGGGGCG GGGCAGGTAG GCAGCGTTGC ACGTGGCCCT 73800

CAGCGCGTCC CCGACCACCA GGGCCAGCAC GTAAGGGACG AACCCCGGGT CGGCGAGGAC 73860 

GTTGGGGTGG ATGCCCTCCA GGGCCGGGAA GCGGATCTTG GTGGCCGCGG CCAGGTGAAC 73920 

CGAGGGGGCG TGGCTAGGCG GCCCGACGGG GAGCAGCGCG GACAGCGGCG TGGCCGGGGT 73980 

GGTGGGGGTC AGGTCCCAGT GGGTCTGGCC GTACACGTCG AGCCAGATGA GCGCCGTCTC 74040 

GCGCAGGAGG CTGGGCTGGC CGGCGCTGAA GCGGCGCTCG GCCGTCTCAA ACTCCCCCAC 74100

GAGCGTGCGC CGCAGGCTCG CCAGGTGTTC CGTCGGCACG GCCGGGCCCA TGATGCGCGC 74160 

CAGCGTCTGG CTGAGGACGC CGCCCGACAG GCCGACCGCC TCACAGAGCC GCCCGTGCGT 74220 

GTGCTCGCTG GCGCCCTGGA TCCGCCGGAA CGTTTTCACG TAGCCGGCGT AGTGCCCGTA 74280 

CTCCCGCGCG AGCCCGAACA CGTTCGCCCC CGCAAGGGCA ATGCACCCAA AGAGCTGCTG 74340 

GATCTCGCTG AGCCCGTGGC CGGGGGGCGT CCGCGCGGGC ACCCCCGCCA CCAAAAACCC 74400

CTCCAGGGCC GATATGTACT GGGTGCAGTG CGCGGGCGTG AACCCCGCGT CGGTAAGCGT 74460 

GTTGATCACC ACGGAGGGCG AGTTGCTGTT CTGGACCAAA GCCCACGTCT GCTGCAGCAG 74520 

CGCGAGGAGC CGTTGCTGGG CCCCGGCGGA GGGCGGCTCC CCTAGCTGCA GCAGGCCGGT 74580 

GACGGCCGGA CGGAAGATGG CCAGCGCCGA CGCACTCAGA AACGGCACGT CGGGGTCGAA 74640 

GACGGCCGCG TCCGTCCGCA CGCGCGCCAT CAGCGTCCCC GGGGGCGCGC ACGCCGACCG 74700

CGGGCTGACG CGGCTTAGGG CGGTCGACAC GCGCACCTCC TCGCGACTGC GAACCATTTT 74760 

GGTGGCCTCG AGGGGCGGGA TCATGATAGC CGGGTCGATC TCCCGCACCG TGTGCTGAAA 74820 

CTGGGCCAGC AGCGGCGGCG GGACCACCGC GCCCCGATCG GGGGTCGTCA GGTACTCGTC 74880 

CACCAGCGCC AGCGTAAACA GGGCCCGCGT GAGGGGGGTC AGGGCGGCGT CGTCGATGCG 74940 

CTGTAGGTGC GCCGAGAACA GCGTCACCCA ATTGCTGACC AGGGCCAAGA ACCGGAGACC 75000

CTCTTGCACG ATCGGGGACG GGAAGAGCAG GCTGTACGCC GGGGTGGTCA GGTTGGCGCC 75060 

GGGTTGCCCC AGGGGAACCG GGGACATCTT AAGCGACATC TCCCCGAGGG CCTCCAGGGA 75120 

GGTCCGGGGG TTCATGGCCA GGCAGCTCTG GGTGACGGTC CGCCAGCGGT CGATCCACTC 75180 

CACGGCACAC TGGCGGACGC GCACCGGCCC CAGGGCCGCC GTGGTGCGCA GCCCGGCGGC 75240 

CTCCAGCGCG TGGGTCGTGT CGGAGCCGGT GATCGCCAGG ACCGTGTCCT TGATGACGTC 75300

CATCTCCCGG AAGGCCGCCT CGGGGGTCTC GGGGAGCGCC ACCGCCATGC GGTGCACCAG 75360 

CAGCCCGGGG AGGTTCTCGG CCAAGAGCGC CGTCTCCGGA AGCCCGTGGG CCCGGTGCAA 75420 

GGCGCACAGT TGCTCCAGGA GCGGGTGCCA GCACGCCCGC GCCTCCGCCG GGCCGACCGC 75480 

CGCGCCCGAC AACAGAAACG CCGCCGTGGC GGCGCGCAGT TTGGCCGCGG ACAGAAACGC 75540 

CGGCTCGTCC GCGCTGCCCG CCGGCTCGCT CGAGGGGGAG GGCGGCCGGC GGAGGTTGGT 75600

CAGGCTCCCC AACAGGACCT GCAACGGTCC GTTTGGGGGT GGAGCGGACG GGGGGGTCAT 75660 

GCCGGCGGGC GCCGGGACCT GGAGCGCGCT GTCCGACATG GCGACCGGCG TGCGCGCTCG 75720 

GCGACGCGGC GCGGAGACCG CGGGCCCAAA CGGGAATGAC TGCCGCCGCC CTATACGGAG 75780 

GGGCTAAGTA TCGCCCGGGG ACCCTTCGAA ACCCCGGGCG TGTCGCAAGT ACGCCGCGAA 75840 

GGCGCGGCGT GTTATACGGC GCGTTATGTC CCGGCATTCC GTTCGTGGGT TCGGGCCCGG 75900

GTGCTGTCGG GTGGGAGTGT GTGTGTGTGG GGGGGGGGCG GCGCGACGGC GGCCCGGACC 75960 

AAGTGTATCG CGGCCGTTCC GTGGGGCGGC CCAACAGGCC CTTTAAACAT TTGCGTATGC 76020 

ACCGGCCCAG CCAGTCGGAC ACCGGAACCC ACCAGAGGCG GAAGCCGCCT TCGCCCGTGA 76080 
GGGTGCGTGT GTTTTCTGGT GGCGTGTTTT TCCTTTCCGC CCTCCTCCCT CCCCACCTCC 76140

ACCACCCCCC CCCCACAACT CGCCCGTTGG CGATCGGCGG GAAAACCATG AAAACCAAGC 76200 

CACTCCCGAC AGCCCCGATG GCGTGGGCCG AGAGTGCCGT GGAAACCACC ACCAGCCCGC 76260 

GCGAGCTCGC GGGCCACGCC CCGCTCCGGC GCGTCCTGCG CCCGCCCATC GCTCGCCGCG 76320 

ACGGCCCGGT GCTTTTGGGG GACAGGGCCC CCAGGAGGAC GGCCAGTACG ATGTGGCTGC 76380

TGGGGATCGA CCCCGCGGAG TCGTCTCCGG GAACGCGCGC TACCCGAGAC GATACCGAGC 76440 

AGGCCGTGGA CAAGATCCTC AGGGGAGCCC GGCGCGCGGG AGGGCTGACC GTCCCCGGCG 76500 

CCCCCCGCTA TCACCTGACC CGCCAGGTAA CCCTGACGGA TCTCTGCCAA CCAAACGCGG 76560 

AGCGGGCCGG GGCGCTCCTT TTGGCCCTGC GGCACCCCAC CGACCTCCCC CACCTGGCCC 76620 

GCCATCGGGC TCCGCCCGGC CGGCAGACCG AGCGACTGGC CGAGGCCTGG GGCCAGCTCC 76680

TGGAGGCCTC CGCCCTGGGG TCCGGGCGGG CCGAGAGCGG CTGCGCGCGC GCGGGCCTTG 76740 

TGTCGTTTAA CTTTCTGGTG GCCGCGTGCG CCGCCGCCTA CGATGCGCGC GACGCCGCCG 76800 

AGGCGGTCCG GGCCCACATC ACGACCAACT ACGGCGGGAC GCGGGCCGGG GCGCGGCTGG 76860 

ACCGGTTTTC CGAATGCCTG CGCGCCATGG TCCACACGCA CGTGTTTCCC CACGAGGTCA 76920 

TGCGGTTTTT CGGGGGGCTA GTGTCGTGGG TCACACAGGA CGAGCTGGCT AGCGTCACCG 76980

CCGTCTGCAG CGGACCCCAG GAGGCCACAC ACACCGGCCA CCCGGGCAGG CCCTGTTCGG 77040

CCGTTACCAT CCCGGCCTGC GCCTTCGTGG ACCTGGACGC CGAGCTGTGC CTGGGGGGCC 77100 

CTGGGGCGGC GTTCCTGTAC TTGGTCTTCA CCTACCGACA GTGCCGGGAC CAGGAGCTCT 77160 

GTTGCGTGTA CGTGGTCAAG AGCCAGCTCC CCCCGCGCGG ACTGGAGGCG GCCCTCGAGC 77220 

GGCTGTTCGG GCGCCTCCGG ATAACCAACA CGATTCACGG GGCCGAGGAC ATGACGCCCC 77280

CTCCCCCGAA CCGAAACGTT GACTTTCCGC TCGCCGTCCT GGCCGCGAGC TCGCAATCCC 77340 

CGCGGTGCTC GGCGAGCCAA GTCACGAACC CCCAGTTTGT CGACAGGCTG TACCGCTGGC 77400 

AGCCGGATCT GCGGGGGCGC CCTACCGCAC GCACCTGCAC ATACGCCGCC TTCGCAGAGC 77460 

TGGGTGTCAT GCCAGACAAC AGCCCCCGCT GTCTGCACCG CACCGAGCGG TTTGGGGCGG 77520 

TCGGCGTTCC GGTTGTCATC CTGGAGGGCG TGGTGTGGCG CCCCGGCGGG TGGCGGGCCT 77580

GCGCGTGATC GTCTATTGAC GACGGCCGCC CAACCCGAGC GACCTTCCCC TCCCACTTTC 77640 

CCCCCCCCCC CTCCTACACA CCAACTCCGC CCTCGCCGTC TTGGCCGTGC GCGGCCCCGT 77700 

GCGTCCGTCT CAATAAAGCC AGGTTAAATC CGTGACGTGG TGTGTTTGGC GTGTGTCTCT 77760 

GAAATGGCGG AAACCGACAT GCAAATGGGA TTCATGGACA CGTTACACCC CCCTGACTCA 77820 

GGAGATAGGC ATATCCTCCT TAGATTGACT CAGCACACGA TCGCACCCCA CCCCTGTGTG 77880

CCGGGGATAA AAGCCAACGC GGGCGGTCTG GGTTACCACA ACAGGTGGGT GCTTCGGGGA 77940 

CTTGACGGTC GCCACTCTCC TGCGAGCCCT CACGTCTTCG CCCACCGATT CCTGTTGCGT 78000 

TCCTGTCGGC CGGTGCTGTC CTGTCGACAG ATTGTTGGCG ACTGCCCGGG TGATTCGTCG 78060 

GCCGGTGCGT CCTTTCGGTC GTACCGCCCA CCCCGCCTCC CACGGGCCCG CCGCTGTTTC 78120 

CGTTCATCGC GTCCGAGCCA CCGTCACCTT GGTTCCAATG GCCAACCGCC CTGCCGCATC 78180

CGCCCTCGCC GGAGCGCGGT CTCCGTCCGA ACGACAGGAA CCCCGGGAGC CCGAGGTCGC 78240 

CCCCCCTGGC GGCGACCACG TGTTTTGCAG GAAAGTCAGC GGCGTGATGG TGCTTTCCAG 78300 

CGATCCCCCC GGCCCCGCGG CCTACCGCAT TAGCGACAGC AGCTTTGTTC AATGCGGCTC 78360 

CAACTGCAGT ATGATAATCG ACGGAGACGT GGCGCGCGGT CATTTGCGTG ACCTCGAGGG 78420 

CGCTACGTCC ACCGGCGCCT TCGTCGCGAT CTCAAACGTC GCAGCCGGCG GGGATGGCCG 78480

AACCGCCGTC GTGGCGCTCG GCGGAACCTC GGGCCCGTCC GCGACTACAT CCGTGGGGAC 78540 

CCAGACGTCC GGGGAGTTCC TCCACGGGAA CCCAAGGACC CCCGAACCCC AAGGACCCCA 78600 

GGCTGTCCCC CCGCCCCCTC CTCCCCCCTT TCCATGGGGC CACGAGTGCT GCGCCCGTCG 78660 

CGATGCCAGG GGCGGCGCCG AGAAGGACGT
GTCGTCCGAC TCCGAAACGG AGGACTCGGA 
GGAGACGCTG TCTCGATCCT CTTCG^TCTG 
CGACTCCGAC TCGCGGTCGG ACGACTCCGT 
GAGCGACGGC CCTGCCCCCG TGGCCTTTCC
AAACCCCGGC CTGGGCGCCG GCACCGGGCC 
CGACTCCGAT TCCGCGGCCC AGGCCGCCGC 
CAGCCAGCCC ACTGTGGGAA CGGACCCCGG 
GAACGCGGAG GCGGTGGCGC GGTTTCTGGG 

GCTGGAGTAC TTCTGTCGGT GCGCCCGCGA
CGGCAGCGCC CCCCGCCTCA CGGAGGACGA 
GATGCGACGC CTGTGCCTGG ACCTTCCCCC 
TCTGAGGGAG TATGCGACGC GGCTGGTTAA 
CCGCCTGTAT CGCATCCTGG GGATTCTGGT 

CTTTGAGGAA TGGATGCGCT CCAAGGAGGT
TCGCGAACAC GAGGCCCAGC TAATGATCCT 
GATCCACAGC ACCCCGAACA CGCTCGTCGA 
AGAGTTTTAC CTCAAGCGCT TCGGCGGGCA 
CCGCATCGCC GGGTTCCTGG CGTGCCGGGC 

GCGACAGGGG TCGTGGTGGG AAATGTTCAA
GATCGTGCCG TCCACCCCCG CCATGCTGAA 
CTGCTACCTG GTAAACCCCC AGGCCACCAC 
CAACGTGAGC GCCATCCTCG CCCGCAACGG 
CGACGCCAGC CCCGGCACCG CCAGCATCAT 

GGCGGCGCAC AACAAACAGA GCACGCGCCC
GCACAGCGAC GTTCGGGCCG TGCTCAGAAT 
GCGCTGCGAC AACATCTTCA GCGCCCTCTG 
CCGCCACCTC GACGGCGAGA AAAACGTCAC 
GTCGCTCGCC GACTTTCACG GCGAGGAGTT 

GGGGTTCGGC GAAACGATCC CCATCCAGGA
CACCACCGGA AGCCCCTTCA TCATGTTTAA 
CACGCAAGGG GCGGCCATTG CCGGCTCCAA 
CAAACGCTCC AGCGGGGTCT GCAACCTGGG 
GCGGACGTTC GATTTTGGCA TGCTCCGCGA 

TATCATGATA GACAGCACGC TGCAGCCGAC
GCGGTCCATG GGCATTGGCA TGCAGGGCCT 
TCTGGAGTCG GCCGAGTTCC GGGACCTGAA 
GGCCATGAAG ACCAGTAACG CGCTGTGCGT 
GCGCAGCATG TACCGGGCCG GCCGCTTTCA 

GTACGAGGGC GAGTGGGAGA TGCTACGCCA
CCAGTTCATC GCGCTCATGC CCACCGCCGC 
CTTTGCCCCC CTGTTCACCA ACCTGTTCAG 
CCCCAACACG CTCTTGCTGA AGGAACTCGA 
CGCGATGGAC GGGCTCGAGG CCAAGCAGTG
CCCCGCCCAC CCCCTCCGGC GGTTCAAGAC 
CGACCTGTGT GCAGACCGCG CCCCCTATGT 
CACAGAGAAG GCGGACGGGA CGCTCCCCGC 
ATATAAGCGC GGCCTGAAGA CGGGGATGTA
CGGGGTGTTC GCCGGCGACG ACAACATCGT 
GCTCCGATCG GGGTCAGGCG TCGCTCTCGG 
CCCCCGCGAG CACCGACCCC CTAGATACCC 
CGGTGTGCCC CACCCCCGAG CGGTACTTCT 

TTCGCTCCCT CAGCATCCTG AACCGCTGGC
AGGAGGACGT CTCCAAGCTC TCCGAGGGCG 
TCCTGTCGGC CGCGGACGAC CTGGTGACGG 
AACAGAAGGA CATTCTTCAC TACTACGTGG 
GCGTCTACAA CATCATCCAG CTGGTGCTCT 

ATGTGGCCCG CACCATCAAC CACCCGGCCA
GGGTGCGGGA ATGCGACTCG ATCCCGGAGA 
TCTTTTTTGC CGCCTCGTTC GCCGCCATCG 
TCACCTGCCA GTCGAACGAC CTCATCAGCC 
GCTACATCTA CAACAACTAC CTCGGGGGCC 

GGCTGTTTCG GGAGGCGGTG GATATCGAGA
ACAGCTCTAT CCTGAGTCCG GGGGCCCTGG 
CGGATCGCCT GCTGGGCCTG ATCCATATGC 
CCAGCTTTCC CCTCAGCCTC ATGTCCACCG 
GCACCTCGTA CGCCGGGGCC GTCGTCAACG 

ATGTCTAACC GAAATAAAGG GGTCGAAACG
CAGGGGAGGG GGGTGGCGGC TGGGGAAAGG 
CAAAAGGGAA ACGCGTCCAA CCGATAAATC 
TAACAAACGA TTTTATTACT CTTATTATTA 
GCGCGTTTCC TCCGTTCCGG CTACTCGTCC 

GCGGGCGGGG GCGCGTGGGC CCACAGCTGC
GTGATATGCC GAGTCACGAT GGAGCGCGCT 
GGCAGTCTTT TTAGAAGAGT CCAGGGTCCC 
TACTTGACGT ATCTGTGCTC CACCAGCTCG 
AGGGCCTCCG GGGCGTCGTG GATGACGTGG 

TCCGCGACCC GCTGCGCGTT GGGGACCTGC
GAGGACGACC GGGCGCCGTC GCGCGGCCCA 
TCTTCTTCGT AGTCGTCCTC GCCCGCGATC 
CGCGTCTCGA GGCCGACCGG GGCCGCGGTC 
TTGGCGCGCT CCCGCCGGGC CGCCCGGCGG 

CACTCGCGCA GCACGTCCTC GACGGACGCG
CAGCGGACGA ACAGCGCCAG GAACTGCGGG 
CGGCAGTGAA TCGTCGGAAT GTAGCCGGTG 
AGCAGGAGAT CGGTATCCGT GGTATGCACG 
GCGCAGGCGT CGTCGGCCTC CAGCTGACCC
AGAACGCGGA TACAGAACAG GTGAGCCAGG 
GGGGCCGCCG GGCCTGGGCC GGCGGCCCGC 
CCGCGGCGGC GCATGTTGGA AAAAGGCGAA 
CGGCGGCGAG GCGTCTACGT CACTGGCCTC
GGCCAGGATC GCCTTGGCCC CGAACACAAC 
CACGAAGATG GGGAACAGGG ACTTTTGGGT 
TAGCGTGATT GCCTCGCGGT CGTAACTTGG 
ATACATGACA TTCCACAGGT CCACGGCGAT 

GCCCCGGCGC TTCACCAGAT GGTGAGTCTG
TCCGGCACGA TTGTAGGTGC GGATAGGTCT 
GGGACACCCA AGCCCGCCGC CCCTGTGTAC 
AGACGCTATC CCGGGAAAGG CACGCTCTTT 
GGATTGGTGC AACCGCCGGC GCGCGCCGGT 

CCCCCTCCCC CGAGCCCTCA AAGAGGGTGT
GACTAGGGCG GCGGGTCCGC CGTAGTCCTT 
GTCCCCCGGC CCCCCTAACC CCCATCCGGT 
CAAACGGCCG CGCCTCCGGG CCCGGTGACA 
GGAGCGTCGC GGCATGGCTC ATCTTCCCGG 

CGCGATCCCG TCGCCGCGCG AGCGGACGGA
GGGCGCCGAG CTGAACGGGA TCCTGCAGGC 
CTCGCTCCTG GTCGTGGGCG ACCGAGGCAT 
GGTGTTTCTG CCCCTCGACC ATTCGCAGTT 
GGCGTTCCTG TCTCTCGTGG ACCAGAAGCG 

GTACCCTGAC CTGCGGCGGG TGGAGCTGAC
GGTGCAGCGC ATATGGACGA CCGCGTCCGA 
GCTCATGAAA CGCGAGTTGA CGAGCTTCGC 
CCAGCTGCGC CTCACGAAGC CCCAGCTCAC 
CGCCAAACCC ACCACGTTCG AGCTCGGCCC 

CACCTGCGTC ACCTTTGCCG CCCGCGAGGA
CCAGATTCTG ACCAGCGCGC TGAAGAAGGC 
CTACGGGGAA AACACACACC GCACATTCTC 
GGTCCTCCGG CGGCTCCAGG TCGGCGGGGG 
CCCCAGCGTG TGTGTCACCG CCACCGGCCC 

ACCCCAGCGG GTCTGCCTGA ACTGGCTCGG
GGCGTCCCAG GACTCTCGGG CCGGCCCGAC 
CGCGGGCGAC CGCGGCGCCC CAGAAGAAGA 
CGCGTTCCCG GAACCGCCGG GAACCAAGCG 
GGACGACGCC ACCAAGCGCC CGAAGACGGG 

GCCCCCCCTC TCCGCGAGAT ACGGACCCGA
CTACGCGTGC TACTTTCGCG ACCTCCAGAC 
CTTCCGGGGT CCCCAAAGAC CCCCATACGG 
GGCCGAACGC TTCACCGCGC CCGGGCACGC 
TGGGGTAGTG TGTCCCCCCC CTCCAACCAA
CCTCCGTCCT TTCCACCCCC CTTCCCCCTC 
GGGCCCGGGG CCCTTCGCAG CTTCACCGAG 
TCGCGGGGAC AGCCCCGGGG TCGCGGGCGG 
TGGGGACGAC GGGCGCCCCC GCCTCGCCTG
TCTCTGGCTC CAGGCCACCA CGCTGGGCTT 
GTATGCGGAC GCCATGTCGG GGGCGTTCGT 
CGCCCCCCCC GCGTTCGCCC GGCCGCCGAC 
CGGGGGAGCG GCCGTGGCCC TGTGGAGCCT 

GGGCCCGGCG ACCCAGTGCC TGGCGCTCGG
CGACGACGTC CATCCCCTTT TCCTCCTCGC 
GGTTGTCGTC GGCGGGCTGA CGATAGGCGG 
CGCCGCCGCG GCCCTGACGG CGGCGGTGGT 
CAGCTTTTCC AAGGCCTGTC CCCGCCACCG 

TCCCCCGCCC CGATACGCCC CGGAGGACGC
ACCGTCGACG CACCACCAGC GATCTCCGCG 
AAACATCTGG GTTCCCGTGG TGACCTTTGC 
GCGAGGGTCT GACGCGGCTC CGTCAGGCCC 
CGGGGGCCAC GCGGCGGCGG GCCTGACGGA 

CACGGACCCG CTGCTGTTTG CGTACGTCGG
TGTGGTCCCC GACATCGCCG TATACGCGAT 
GCAGGTGCTT GGGCTCCGGC GCCGCCTTCA 
CGCGACCCTG CGGGGCCTCT TTTTCTCCGT 
GGTGCGGCCG CGGATGGCGG CGAGCCGGCG 

CACGAGTTCC CCGAATACCA CCGGCGTGTG
GGGGAGGGGG GAAGGAAATG GGGGCGGGGG 
CAGGCGGGCC CATCACTGTT AGGGTGTTAG 
CGTGTTGTAG TTGTCCGCGG GAGGCGGTGG 
GCGCCCACCG GTCCTTCGCG GGGGCCGGGG 

GCCTAGCCGT GGGCCTGTGG GGCCTGCTGT
CCTCCCCCGG ACGCACGATA ACGGTGGGCC 
CCGCGTCCCC GCGGAACGCA TCCGCCCCCC 
AGGCGACGAA AAGTAAGGCC TCCACCGCCA 
CGAAGACATC CTCGGAGCCC GTGCGATGCA 

CGCGGGTGCA AATCCGATGC CGGTTTCCCA
TCTGGCGTTA TGCCACGGCG ACGGACGCCG 
TGATGGTAAA CGTGTCGGCC CCGCCCGGGG 
GAACGGACCC GCACGTGATC TGGGCGGAGG 
ACTCGGTCGT CGGGCCGGTG GGTCGGCAGC 

CCCAGGGCAT GTACTACTGG GTGTGGGGCC
GGGTGCGCGT TCGCGTGTTC CGCCCTCCGT 
AGGGCCAGCC GTTTAAGGCG ACGTGCACGG 
AGTTCGTCTG GTTCGAGGAC GGTCGCCGGG 
TATGGCTGTC


GTGTGTGGTT 


CCGGGTTGCG


86460 


CTTTTTTGTT 


TTGCGTGCGC 


TTATAAGAGC 


86520 


AGCGCCGTCG 


GGCCCCGGGT 


GCGGGATGTG 


86580 


GAGCGGCGAA 


CACTGCCTCG 


GAGGGGATGA 


86640 


CGTGGGTGCC 


ATCGCTCGGG 


GGTTCGCGCA 


86700 


CGTGGGGTCT 


GTCGTTCTGT 


CGCGCGGCCC 


86760 


GATCGGGAGC 


ACCGGCCTGG 


GGTTCCTCCG 


86820 


GCGTGTGTGC 


GCGTGGCTGA 


GGCTGGTCGG 


86880 


CGGGGAGGCC 


GGCGCGCCTC 


CGGGGGTTCC 


86940 


GGCCGCCTAC 


GCGGCGCTGC 


TGGTGCTGGC 


87000 


CCCGCGGCCC 


CTGTTTGTCG 


GCACCCTGGG 


87060 


CAGTGCGCGC 


TACTGGTGGA 


TCGACCCCCG 


87120 


GGCGGGCCTC 


GGGACAACCG 


CCGCCGGGGA 


87180 


CCGCTTTTGC 


GTCGTCTCCG 


CGGTCGAGTC 


87240 


CGAGCGGCCA 


ACAGACCACG 


GACCCCTGTT 


87300 


GGTCTGCGGC 


GACGGGGCCG 


CACGGCCCGA 


87360 


GGGCGCGCTC 


GCGCTGGCCG 


CCTGCGCCGC 


87420 


GGTCCTGCCG 


CTGTGGCCCC 


AGGTGTTTGT 


87480 


GCTGTGTCAG 


ACCCTCGCGC 


CCCGGGACCT 


87540 


ATTCCAGGTC 


GTGAACCACG 


GGCTGATGTT 


87600 


GCTGGGGGGC 


GCCGTGTGGA 


TCTCGCTGAC 


87660 


CAAGGACCCA 


GACGCCGGGC 


CCTGGGCGGC 


87720 


CTACGCATTG 


GGGTTTGCGG 


CGGGGGTGCT 


87780 


GTCGGGGTGA 


TCGCCATTTC 


AAATAAAAGG 


87840 


ATGATTTCGC 


CCTACCGCTC 


CGATCCCCGG 


87900 


TGCCGTGGAC 


GGGTATAAAG 


GCCAGGGGGG 


87960 


GTTGGGAGGT 


GGCACAAAAA 


GCGACACACC 


88020 


TTTCCGGCAA 


CCCTCCTCGC 


TGCGCCGGGC 


88080 


CTCTTCTGGT 


CATGGCCCTT 


GGACGGGTGG 


88140 


GGGTGGGTGT 


GGTCGTGGTG 


CTGGCCAATG 


88200 


CGCGGGGGAA 


CGCGAGCAAT 


GCCGCCCCCT 


88260 


GAACCACACC 


CACGCCCCCC 


CAACCCCGCA 


88320 


AACCGGCCCC 


GCCCCCCAAG 


ACCGGGCCCC 


88380 


ACCGCCACGA 


CCCGCTGGCC 


CGGTACGGCT 


88440 


ACTCCACCCG 


CACGGAGTCC 


CGCCTCCAGA 


88500 


AGATCGGAAC 


GGCGCCTAGC 


TTAGAGGAGG 


88560 


GCCAACTGGT 


GTATGACAGC 


GCCCCCAACC 


88620 


GCGCCGGCCC 


GGGCGCCAGC 


CCGCGGCTGT 


88680 


GGCTCATCAT 


CGAAGAGCTG 


ACCCTGGAGA 


88740 


GGACGGACCG 


CCCGTCCGCG 


TACGGGACCT 


88800 


CGCTGACCAT 


CCACCCCCAC 


GCGGTGCTGG 


88860 


CCGCCACCTA 


CTACCCGGGC 


AACCGCGCGG 


88920 


TATTCGATCC 


GGCCCAGATA 


CACACGCAGA 


88980 
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CGCAGGAGAA CCCCGACGGC TTTTCCACCG 
GCCAGGGCCC CCCGCGCACC TTCACCTGCC 
TCTCTCGGCG CAACGCCAGC GGCACGGCAT 
AGTTTACGGG CGACCATGCG GTCTGCACGG 
5 CCTGGTTCCT GGGGGACGAC TCCTCGCCGG 
CGTGCGGGCG CCCCGGCACC GCCACGATCC 
CCGAGTACAT CTGCCGGCTG GCGGGATACC 
GCAGCCACCA GCCCCCGCCG CGGGACCCCA 
GGGCGGGGAT CGGAGTGGCT GTCCTTGTCG 

10 ACCTCACCCA CGCCTCCTCG GTGCGCTATC 
GGCCGCCGGT TGTCTTCTTT TCCACCCCTT 
ACCCCCCCGC CGTCCCCCGG GCGTTATAAG 
CCTCGGCCCG ATCCGAACGG CGCACGCCGC 
CCCCGCCCCG ATATTCAAGC CCGCGGTGGT 

15 TACCAGCCCC TCGCCCCCGC GGCCTCCCCG 
ATCGGCGTCG GAGCGATCGT CGGGGCCTTT 
CCTCGGTCCT CGTGGGGACT CTCGCCGTGC 
TGCGTCGCGT GGGACCCCAC CCCCGTCGAG 
CCGGCCACCC TTATCCCCCG TGCGGCCGCC 

20 GCGGAGAGAT CGTCGGGTTA CTGGTGGGTG 
CTCGTCGACA GCGTCAGTGG CATCGACGAG 
TACTACCCAC GAAGCCCCGG CGGGTTTGTC 
GGGTTGCCGT GAGGCGCGCG TCCGACGGTC 
CACCCCACCC ACCGACCAAC GACGGCGTTT 

25 TCCCCCCCCA AAAAAAAAAA CAATAAACAG 
CGCTGTTTTT TTTTCTCTGT TTGTTACTTT 
CCGGAAACCG AGACGGTGGG GCCGGCGGTC 
CGTTTGAGCT TCGTCAACAG GGCGCTGAGG 
AGCGCGTTGG TCCGGGGGCG GGCGGGCATG 

30 CGTGTGGCCC CCGGAGGGGA GAAGAGGGCA 
GCCTCGATGT ACGGGGAGTC CGGGGCGTCT 
CGGCGAAGGC AGATGTTTTC GTATACCCGA 
CCATCCTCGC TCACCGACTC GTAAATGGAA 
TGGCTTTCGG CCGGCCAGGC GGCGGCGGTG 

35 ACGCCCGCGG GCATGGCGGC GTCATCGTCG 
GGTTCGGCCT CCGCGTCTGG CCCCCAGGTC 
CGCTGAGCCG CGAGCGGGCG CGCCGCGGCT 
CGGGGCGGGA GGCGCGGGGG CGCCCCGGCC 
GCGCAGGGCT CGGGACCCCG GTCGGCCGCG 

40 CGGTGGCGCC TGAACCTCCG AGGGGCCGCG 
GGCGGGGGTG CGTTATCGCG CCGGGTCCGT 
GCGCCGCGGC CCCCCGGTGG GCCGGACGCC 
GACGGGGTTC CGCTCCGAAG CAGGTCCGGG 



TCTCCACCGT 


GACCTCCGCG 


GCCGTCGGCG 


89040 


AGCTGACGTG 


GCACCGCGAC 


TCCGTGTCGT 


89100 


CGGTGCTGCC 


GCGGCCAACC 


ATTACCATGG 


89160 


CCGGCTGTGT 


GCCCGAGGGG 


GTGACGTTTG 


89220 


CGGAGAAGGT 


GGCCGTCGCG 


TCCCAGACAT 


89280 


GCTCCACCCT 


GCCGGTCTCG 


TACGAGCAGA 


89340 


CGGACGGAAT 


TCCGGTCCTA 


GAGCACCACG 


89400 


CCGAGCGGCA 


GGTGATCCGG 


GCGGTGGAGG 


89460 


CGGTGGTTCT 


GGCCGGGACC 


GCGGTAGTGT 


89520 


GTCGGCTGCG 


GTAACTCCGG 


GGCCGGGCCC 


89580 


CCGTCCCCCG 


TACCCACCAC 


ACCCCACCCC 


89640 


CCGCCGCACT 


CGCTTTTCCC 


ACCGGAAAAT 


89700 


GTGGGCTCCA 


AACGCCTCCG 


GAAGAGAGCG 


89760 


GCTATGGCTT 


TCCGTGCTTC 


GGGACCCGCC 


89820 


GCGCGGGCTC 


GTGTTCCGGC 


CGTGGCCTGG 


89880 


GCGCTCGTCG 


CCGCGTTGGT 


TCTCGTACCC 


89940 


GACAGCGGCT 


GGCAGGAATT 


CAACGCGGGA 


90000 


CACGAGCAGG 


CGGTCGGCGG 


CTGCAGCGCG 


90060 


AAGCACCTGG 


CCGCTCTGAC 


ACGCGTCCAG 


90120 


AACGGAGACG 


GCATCCGGAC 


CTGTCTGAGA 


90180 


TTTTGCGAGG 


AGCTCGCGAT 


CCGCATATGC 


90240 


CGCTTCGTAA 


CTTCGATACG 


TAACGCCCTG 


90300 


CCGCTTCTCG 


CCTCTCTTCT 


TCCCCCTCCC 


90360 


GGCCAATACC 


CTCCTTTTTT 


CTTTTTCTCT 


90420 


CTAATTGCGT 


ACGACAAACC 


ATGCGGAACT 


90480 


TTATTGAAAA 


CAGACATACG 


GGGAAAGGGG 


90540 


GCATTTTTTT 


AATGGCTCTG 


GTGTCGGCCG 


90600 


GCGGCGACGT 


TTGTCGGGCC 


GTCGTTGGCC 


90660 


GGCGACAGGC 


TTAGTCCCGG 


GTCCGGGGCG 


90720 


GACCCGCCCC 


AGTCGTACAG 


GGGATTTTCC 


90780 


CCCGGCGGGG 


CCGCCCCGCC 


GGCGTCTTGC 


90840 


ACCCAGGGGA 


TCTCCTCGTA 


GACGCGCCCC 


90900 


TCTGCGTCCT 


CGGAGGGGGC 


GCGGGGGGCG 


90960 


GTGTCGGCGG 


CGGGGGTGGC 


GCCAAGCCCG 


91020 


GGCAGCAGAT 


ACGTGTTTTC 


CATCTGGTCC 


91080 


CGCACCGCGT 


CGTAAACCCC 


GGCGGCCTCG 


91140 


GCCGGCCGCT 


GCTCGGGGGG 


CGCGGGGTTG 


91200 


ATATGCGTGT 


AATACGTGGC 


CGGCCGGCCG 


91260 


TCGACGTGCG 


GGGGCTCGGG 


GAGGTCCTCG 


91320 


GGGGTCGAGT 


GGGGGCGAGC 


CCGGGGGAGC 


91380 


TGTATCTTGT 


CCCGGCAGCT 


CCCGCCGACC 


91440 


GCGAGGCGCA 


GGATGGACTC 


GTAGTGGGGC 


91500 


GCCAGGGCGG 


CCCCGAACCA 


GGACTTGATG 


91560 
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CTGAGTTCCA TCCGGGCCCA GCTCGGGGCG GTCATCGTGG GGAACAGGGG GGCGGCGGTC 91620 

CTGCAGAAGC GCTCCTGGCT GTCCACCGCC GCCCGTAGGT ACTCGTTGTT CAGGCTGTCG 91680 

GAGGCCCAGA CGACATACCC GGTAAGCGTC GCGTTAATTA TATACTGGGC GTGGTGGTGG 91740 

ACTATGGATA GAACCTCGAC GGTCGAGACG ATGGCGTCCA CGATCCCGTA CGTGCCGCCG 91800 

CTGCGCTTGC CGGTCTCCCA CAGGTGGGCC AGGCGCGTCA GGTGGCCCAG GACGTCGCTG 91860

ACCGCCGCCC GCAGGGCCAT GCACTGCATC GAGCCCGTGG TGCCGCTGGG CCCGCGGTCC 91920 

AGGTGGCGCG CAAACGTCTC CGCGGGCGCC TCCAGACTCC CGCTGAGCGC CACGAACCGG 91980 

CGATCGGCGG GGCCCAGGCG GCGACACACG TACTTGTCCG CCGTCCACAG CATCCACGAG 92040 

GCCCAATGGT ACAACACGGA GACGTAGGCC AGGAACTCGC TCAGCCGCAG TGCGGTGTCC 92100 

GTGCTCGGCC GGCTCGGGTC TGCGGGGCGC ATAAAGAACA TGTACTGCTG GAGCCTGTGG 92160

GCCGCGTCGC GCAACCCCGC CACCGCGGCG GCGTACTTGG CCGCGGCGGC CCCGCTCTTG 92220 

AACGGGGCGC GCACCAGCAG CTTCGGGAGC AGGGTGGGCC GCAGCAGCAC GTGCAGGCTG 92280 

GGGTCGCAGT CGCCCGCCGG GTCGTCGGGG ATGTCCAGGC CGCTGGGCAC GACCGTCTGG 92340 

AGGTACTTCC AGTACTGCGC TAGGATGGCG CGGCTCAGCT GGCCGCCCGA CAGCTCCACC 92400 

TCGCCGAGCG CCTGCTTGGC GGCCGACGCG TAGTGCCGGA TGTAGTCGTA GTGCGGGTCG 92460

CTGGCGAGCC CGTCTACGAT CAGGCTCTCG GGGACGGTGT TATGGTGCCG CGCCGCCAGC 92520 

CGGACGCTGC GATCGGCGCC GGTCAGAAAC GCCGGCTGCA GGTCGTCGGC GCGCTGCCGC 92580 

AGGACGCCCA CGGCCGCGCT GAGGAGCCCC TCCGGGGTGG GGAGCAGACA CCCGGCGAAG 92640 

ATGCGCCGCT CGGGGACGCC CGCGTTGGCG CCGCGGATGA GGTTGGCCGG CGTCAGGCAC 92700 

CGCGCCAGCC GCAGGGAGCT CGCGCCGCGC GCCCGGCGTT GCATGGCGGA GACCGTTCGG 92760

TCGGGGGCCC GCCGGTCGGA GGTATGCCGC GTCCCGGGAT ATAGGGTTGC TTTTTATGGG 92820 

GAGGCGCCTA TGGGCGTGGC GGGCCGCCCA GCCCGGTCGC GCGCCTCCCG GACACGTGCG 92880 

CCCGGAGGGC GGCGGTCTCC TCGTCGCCCA TGAGCAGTTT CCGAAACTGC GCCATGATGT 92940 

CCACGACGCG GACCCGCGGC CCCAGCACGG ACTCGCTATT CAGGGGGGCG GGGGGGAAGG 93000 

CCGCCAGGTC TTCGAGCAGG AAGGCGGGGT CTGCCGTCCC GCTCACGGGC GCCCGGGGCG 93060

CCGAGGACGC GGGGCGAAGG TCCACGTGTT CCGCGGCGGC GCGCACGTCC GCCCAAAATT 93120 

TGGCGGGGGT GGTCCGCGCG TACAGGGGCT GGGTCGCGCG GAGGACGCAC GCGTAGCGCA 93180 

GGGGGGTG^A CGTGCCCACC TCGGGGGCCG TCGACCCGCC GTCAAACGCG GCCAGGGCCA 93240

CGCACGCGAC CACCGTGTCG GCCAGGCCCA GCAGCCGCTG CAGGATGAGC CCCGTCGCCA 93300 

GCACGGCGCG CGCGGCCGCC GCGTGGTCCC TGCGCCGGCG CGCGTCCCCG CAGGCCAGGG 93360

CGTATTTCAG GGTAACGGTC GCCAGGGCCG TGTGCAGCGC GTACACGGCC GCGCCCAGCA 93420 

CGGCGTTCAG CCCGCTGGTG GCGAGCAGGC GGCGCGCCGC GGTGTCGCCC AGCGCCTCGT 93480 

GCTCGGCCGC CACGACCCCG GGGCTACCCA GGGGCAGGGC GCGAAACAGC GCCTCCTGCT 93540 

CCACGTCCAC AAACGCGGGG TGGGCGGAGT GCGGGTGCAG GCGCGCCCCC ACGACCACCG 93600 

AGAGCCACTG GACCGTCTGC TCCGCCAGGA CCGCCAGCAC GTCCAGGACG CGCCCCGCAA 93660

ACGCGGCCTC CCGCGGGAGC ACGCATTTGA CGGCGCCGGG GTTGAAGCGG GCGAGCAGAG 93720 

CCCCGGTGGC GATGTACGTC ATGCGCCCCG CGTAGCGGGC GGCCACGCGA CAGTCGCGCC 93780 

CCAGGAGCGC GCGCACCCCG GGCCAGTACA GCAGGGACCC CAGCGAACTG CGAAAGACCG 93840 

CGGCGTCGGG GCCGGGGTGG GGGGGCGCGG CCCCTCCCGC GCTGAGCAGC GGCACGGCGG 93900 

CGGCCCCCAC GGGCCGCAAC GCCGTGAGGC TCGCGAACTG CCGTCGGAGC TCGGCCGCCC 93960

TGTCGTCGAG CTCCGAGCCG CGCGCCCTCC GTGTGCAGGC GCGTCCCGCA GACCCACCCG 94020 

TTGATCGCCA CCCGCACGAT GGCGTCCACC AGAAAGCCCA TCGCGCGGGA GGGGCTGGTT 94080 

TTTGCCCGCC GATCCGTCAG GTCGAGGATC GCGTCGCCCG TGACGTACCA GGCCAGCGCC 94140 

TCGCCCTGCT GCAGCGTCTG GCGGAAAAAC ACCTTTGGGT CGGCCGGGGA GGCAAAGTGC 94200

ATGACCCCCA CGCGCGACAG CCCGAACGCG CTATCCGGAC ACGGGTAGAA CCCGGCCGGA 94260

TGTCCCAGGG CCAGGGCCGA GCGCACGGAC TCGTCCCACG CGGCGACTCG GGGGGTCAGG 94320

CGGTCCAGGG GGAATGCCGC CTGCAGCTCC GGGCCCGACA CGCGGCCCGC GAGAATCTCG 94380

ACCGTCGCGG GAGGCCGCGC CCCGGCGCCG TCATCGTGCG CGACGGCGGC GGGGTAGTCG 94440

TCCTCCTCGT AGCTGAGCTC GTCCAGGAAC AGCGGCGAGG GCACCACCCG CGAACCGCCC 94500

ACCCGCCCCA AAACGTCGCG TGGGTCCATC GGGCCCAGGT AGCCTCCCCG CGGGGCCCGC 94560

GTGATGGCGC TGTCCCGGCG TCCGCGAACG GACTGGCTCC TGGCCGTAAC GGACCTGGGG 94620

CGCGGAAAGG ACGCCCGGCG GGGGGGCGCC GCCGCCCGGG CCTCGGACGC GCGTCGGGAC 94680

CCGGGGTGAC CGCGGGCCTC CCGGCGACGG CGCGGGGGCG GCTCTTCGCT CGCCATCTCC 94740

CCCGCGGCCT CGACCTCGCT GTCGTCGTCC ACGTTAAACA CCGCCCGCAG GTACCCCATT 94800

AACCCGACTC CACCGCCCTC GGGCTCGTCC TCCACGGGCG AGTCGGCGCG ATGCGCGGAC 94860

GGGGCATGGG ACCGGGTGGA GGCGCGCCTC CGGCGTACGG CATGCCCGCG CACGGACATG 94920

GTGGCCGGAG GCCCGATTTT TTACACACGC CCTCCCCGCA GACGGACGAG GAAAGGGGTG 94980

GTGCGAGGGG GGAGGCCCAA ACGGGGAGGT GGGGGGTAGG GGGCGGTCCC AGGGAGCGGG 95040

GGGTAGGAAC CGGCACGACG GGAACAGAGA AACGCGACCG CTCCAACAAG GGTGGGGGGT 95100

GGGCCTCGTC CCCACGCAGA CCCGCGGGCA AATGCGAGAA CGGGACCCGC GCGCCTGCCT 95160

TTATACGCGG ACCCCAGCAC CACGAGCCGT TCTGTGACGC GAATCTACAC GACCGCGGGC 95220

TCGTAGGCGC GACTAACGCC CAACCCAACG GCACACACCC CCCACCCCGC GCGTAACCCC 95280

ATTTCTTTCA TGGTCCCGTA ATAAACAGCC AACGCACGCC GCGTATGATG AGTTGCTTGC 95340

CAATGTTTAT TGCTGTGGTT GCGAACCCTC TATCGCGATA CAGACGGAAG TGAGGCGGGG 95400

CGGTGGTGGG GGGGGGGGGC GCGCCGCCCG GTCGCACATC CTACCCCCCA AAGTCGTCAA 95460

TGCCCATGGC ATCGGTAAAC ATCTGTTCAA ACTCAAAATC GTCCACGTCC AAAGCCCCAT 95520

ACGAGACGGG GTCGTGGGTC ATTCCCGGGG AGGGGGACTC CACGTCCCCC AGCATCTCCA 95580

AGTCGAAGTC GTCCAGGGCG TCGGCGGGCG TCATATCCAC CTCCTCGCCG TCCAGGCGGA 95640

GTTCGTCTCC CAGGCTGACG TCGGTAATGG GGGCGGTGGT GGACAGTCTG CGGGGGCGTT 95700

GTCCCGCGGA GAGAAACGAC ATGCGCGGCG CCACCAGCCC GGCCTCCGCG GGAGCGTCAT 95760

CGTCGTCCGG GAGGTCGAGC AGGCCCTCGA TTGTCGATCC GTAATTATTT CTGGTCCGCC 95820

CGCGGCTATA CGCGTGCTCC CGCATGACGG ACTCGCCCTC CGAGGTCGCA ACGCTGGAGT 95880

ACGAGTCCAA CTTGGCCCGG ATCAGCAGCA TAAAGTACCC AGAGGAGCGG GCCTGGTTGC 95940

CCTGCAGGAC GGGCGGGGTC GTGAGGGGCG CCCCGGGTTC CTCCGCCGCC GCACTTCGCA 96000

CCAGCGGGAG GTTCAGGTGC TCGCGAATGT GGTTTAGCTC CCGCAGTCGC CGGGCCTCCA 96060

CGGGAACTCC CCGCACGGTG AGCGATCCGT TGATAAACAT CAGGGGCTGA AACAGACACG 96120

CCAACTGGCG CCAGCTCTCC AGGTCGCAGC AGAGGCCGTC GAACAGATCG GGCCGCATCA 96180

TCTGCTCGGC GTACGCGGCC CATAGGATCT CGCGGCTCAG AAAGAGGTAT AGATGCAGAA 96240

ACAGGACGCG CGCCAGGCGC GCGGTCTCGC GGTAGTACCT GTCCGCGATC GTGGTGCGCA 96300

GCATCTCCCG CAGGTCGCGG TTGCGGCCCC GCATGTGTGC CTGGCGGTGT AGCTGCCGAA 96360

CGCTGGCGCG CAGGTACCGG TACAGGGCCG AGCAAAAATT TGCCAACACG GTCCGGTAGC 96420

TCTCCTCCCG CGCCCGCAGC TCACCGCGGA AAAACTGCGC CATGGCCTCG TAGTACGAAG 96480

GCAGCTCGTC GCGGGTGGCG GGCAGGGTGG GGAACGCCAC GTCGCCGTGG GCGCGAATGT 96540

CGATCGGGGA GCGCTCGGGG ACGTGCGCAT CCCCCCAGTC GATCACGTCG CTGGGCAGCG 96600

TCGACAGAAA CTTGCACTCC CGGTACATGT CGGCGTTGGT CGGGAACCCA GAGAACAGGT 96660

CCTCGTTCCA GGTATCTAGC ATGGTACACA GCGCGGGACC CGCGCTGAAG CCCAGATCGT 96720

CGAGGAGACG GTTAAACAGG GCCGCGGGGG GGACGGGCAT GGGCGGCGAG GGCATCAGCT 96780

GGGCCTGACT CAGCCGACCG GTGGCGTACA GCGGAGGGGC GGCTGGGGTG TTCTTGGGAC 96840

CCCCGGCTGG CCTGGGGGGC GGTGGCGAAA CCCCGTCCGC GTCCGCAAAC AGATCGTCGA 96900

CCAACAGGTC CATGGGGGCG GTTGGGTCCG GGAATAACGA TCTCGAGAGG CGAATGAGAC 96960

GTGCCCGAGC GCCCGGCGGC GGAGAGGGGG GGAGGGATCC GGGACCCGCG ACAGAAAAAG 97020

GCCGGGGCCC TCGCGAAGGG AATCGCCGGG GGTGCCGTGC GTCCCCGAGG ACTGACATCT 97080

CGCGTCCACC ACCCCGCATT TAAGTATCAC CCCAGTGCCG CCCCAAACCT CGTGACTTCC 97140

CCACCGCTCC GGGCGGCCCG TCCCCCGCGC TCGGAAGGGA GGCGTGTCGC CCCTCCCGCC 97200

CCTCCCGCCC CTCCCGCCCC TCCCGCCCCT CCCGCCCCTC CCGCCCCTCC CGCCCCTCCC 97260

GCCCCTCGCC ACAAACGCGT GCTGACAGCG AAGTGGTTAA ATCGACCGTG ATGCTTTATT 97320

GTCTGTCGTC TGAACGCGGC CGGGGTCGCT ACTCGAGGGG GCGGCGGGGA CGGGAAGCCG 97380

AGCGGGCGGG GGCCCGTGCG GTCGCGGCGG CACGCCCCGC GGGGCGGCCC CGGGCGGCCG 97440

CGGTCGCGTC GACGTCCTGC GCCGCGTCGG GATTCACCAA CTCGTTCGCG CGCTGCAGGA 97500

GGTTCTTGCC CTCGCAGACC GTCACGCGAA TGGTGGTGAG GTCGAGGAGC TCGTTGAGGT 97560

CTTCGTCGGT GTGCGGCCGC GACATGTCCC ACAGCTGTAC CGCCGCCAGC CGGGCGTGCG 97620

TGGCCGCCAG GCGCCCGACC GCGGCGCAGA AGACGCGCTT GTTGAACCCG GCCACCCGGG 97680

GGGTCCACGG CGCCGTGGGG CTCGGTGGGG CGGTGCTGAA GTGCAGCTTC TTGGCCAGTC 97740

CCTGGGCGGG TGTCTTGGTT CTTCCCGAGG CCGTGGGAGC GGGGGCGTCT AGGAGCACGG 97800

CGGAGTCGGC CTGGGCGGGT CGCCTGCCGC GGGCAGGGTC GGTCGCCGGG GTCGCGGGGG 97860

CCTTAGGCGC CCCGCGCGTC ATTTTGGGGG TCCGCGCGGG AGGGGCGTGC GAGCGCCCGC 97920

CGGCGCCCAC GGGGCCCCCG GGGGGTGGAG GAGCGCGCGC GGGGCCGGGG GCGTGAGAGC 97980

CCGCGACGGA CGCCGAACGA CGCGGTCGCG CGGTATCCCG GGACTCGTCG TTGTCTTCGG 98040

ACGACGACGA GTCCCGGTAG AGGGCATACC CAGCCTCGTC ATAATGGAGA AAGCGAACCT 98100

CGCCCCTTGG GCGCGCGCGC ATCGGGCCAG CGCCGCGGCG GAAGTCGTCG CGCGGACTCT 98160

CTGGATCCGC CGGGGAGACC GGGCCATAGT ACAGCTCCTC GTGGGTCCCG CGCGGCGCTT 98220

CCCGCGGACA CGACTTGACG GAGCGGCGAG AGGTCATGGT CTATCGGAGA CACCGGGGAC 98280

GCCCGTGCGG ATCACAGGGA AGGCGTCGGC GAAGCAGGCA GAGAGCGTCG GAAGGCGGCG 98340

AGGGAGGGAA AGAGGGAGAC CGGCGGGGTA CGGGAGAGCA GCGAGGGCCT GCGTAACCCA 98400

CGGGGGCCGC GGGAGTGGCT CCCTGCGGGT TGCGGGGGAG AGTTTATAGG AAGTGGATAT 98460

AACCGCAGGC GACGGGACTA ACCAATCCCC GGGGGGGCAA CGGACAGACA CGCCCCGAAC 98520

CGGCCCGACT TCCGCGAGGA AGCAAAGGCC GGGGGCCGCC CAACGACACG CCCACCCCTT 98580

CCCAACAGGG CGGGCTCAGG CTGACCCGGC GGCCAGTGCC CGCTGGCATA TCTGATACAC 98640

GTGCGCGATC ATACATACGC CCATCGAGGT CATGCCTAGA TAAAAGGGCA CCAGGACCCC 98700

CGGGACGGAC ACCACACCGG CGCTGTCGCC CCGGCATTGC GCGTCCCCGA TAACGCCGCG 98760

TGCGCCTGCC GCGTTCGGCG GCTCCCCGGG CACGCCCGCG ACGAGCGCGA CGAACAACAG 98820

CACCACCCAG CGGCCCAGTC TTGCGGGTTT CCCCGTCATC GCGGCGATGA GTCAGTGGGG 98880

GCCCAGGGCG ATCCTTGTCC AGACGGACAG CACCAACCGG AATGCCGATG GGGACTGGCA 98940

AGCGGCCGTA GCTATTCGCG GGGGCGGAGT CGTTCAACTG AACATGGTCA ACAAACGCGC 99000

CGTGGATTTT ACCCCGGCAG AATGCGGGGA CTCCGAATGG GCCGTGGGCC GCGTCTCTCT 99060

GGGCCTGCGA ATGGCAATGC CGCGTGACTT CTGCGCGATT ATTCACGCCC CCGCGGTATC 99120

CGGCCCCGGG CCCCACGTGA TGCTCGGTCT CGTCGACTCG GGCTACCGCG GAACCGTCCT 99180

GGCCGTGGTC GTAGCCCCGA ACGGGACGCG CGGGTTTGCC CCCGGGGCCC TCCGGGTCGA 99240

CGTGACGTTT CTGGACATCC GGGCCACCCC CCCGACCCTC ACCGAGCCGA GCTCCCTGCA 99300

CCGGTTTCCG CAGTTGGCGC CGTCCCCGCT GGCAGGGTTA CGAGAAGATC CTTGGTTGGA 99360

CGGGGCGCTC GCGACCGCCG GGGGGGCGGT GGCCCTGCCG GCCAGACGGC GCGGGGGATC 99420

GCTGGTCTAC GCGGGCGAGC TAACGCAGGT GACCACCGAG CACGGCGACT GCGTGCACGA 99480

GGCGCCCGGC TTTCTGCCAA AGCGCGAGGA GGACGCAGGC TTTGACATTC TCATCCACCG 99540

AGCCGTGACC GTCCCGGCCA ACGGCGCCAC GGTCATACAG CCGTCCCTCC GCGTATTGCG 99600

CGCGGCCGAC GGACCAGAGG CCTGCTATGT GCTGGGGCGG TCGTCGCTCA ATGCCAGGGG 99660

CCTCCTGGTC ATGCCTACGC GCTGGCCCTC CGGGCACGCC TGTGCGTTTG TTGTATGTAA 99720

CCTGACCGGA GTCCCGGTGA CCCTACAAGC CGGGTCCAAG GTCGCCCAGC TGCTCGTCGC 99780

GGGGACCCAC GCCCTCCCCT GGATCCCCCC CGACAACATC CACGAGGACG GCGCATTCCG 99840

GGCCTACCCC AGAGGGGTTC CGGACGCGAC CGCCACCCCC CGAGACCCGC CGATTTTGGT 99900

GTTTACGAAC GAGTTTGACG CGGACGCCCC CCCAAGCAAG CGGGGGGCCG GGGGGTTTGG 99960

CTCCACTGGC ATCTAGACCG CGCCTCGCGT CGGGCCAGAT GGGGCCCCGG TCAATAAAGA 100020

GCTCTGTTTC GCATATGCCC TGGTGTTGGC GGTTTTTTTT TTGTTGTCTG TCTGCCCGGC 100080

ACTCGGTTGT CCGTTCTGTC GTCGCTATCA CATACGCACA AACACACGGG TAGAGTGGAA 100140

CCGAAACCGG TCGACGTTTA TTCACCACAC AGAAACACAA GCTAAGCGAG AAGGAGGGGG 100200

GCCTCGGTCG ACGAGGCCTG GCGTTTGGGG GCGGACGTGC GATGACGTGG GTCCGGTGTA 100260

GGGTCCGCGG GGGGCACGGG CCCGGGGCGA ACGGGGGATC TGTCGCCGGC GTGGGTGACT 100320

GGGACCGACG CAACCTCCGG GGCTTGTGCC CTCGTAGGCC CGGGGGGGGC CTCGGTCGCT 100380

CCGAGCCCCG CGGTGCGGGT CCCTCCGGCC AGAGCCGAGG TGGAGAGACC AAGGGCCCGC 100440

TCCGCGATCG CCACGTCCTC CATGACCACG TCGCTCTCGG CCATGCTCCG AATGGCCTGG 100500

GAGACGAGCA CGTCCGCCGA CTTGTCCGCG GCCCCCACCG ACATGTACAT CTGCAGGATG 100560

GTGGCCATGC ACGTGTCCGC CAGGCGGCGC ATCTTGTCCC GATGCGCCGC AACGGCCCCG 100620

TCGATGGTGG AGCCCTCGAG TCCCGGGTGG TGGCGCGCCA GCCTCTCGAG GTTGACCATG 100680

CAGGCGTGGT ATGTGCGGGC CAGGGCGCGC GCCTTCACGA GGCGCCGGGT GTCGTCCAGC 100740

GACTCTAGGG CGTCGTCGAG CGTGATGGGG GCGGGCAAAA GCGCATTGAC CACCGCCAGG 100800

GCCTCCTGCA GCCGCGGGTC CNNNTCCGAG GGCGGAGCCG CGGCCCGAAT CATCTCATAT 100860

TGTTGTTCCT CGGGGCGCGT TCCCCAACCG CACAGCACCC CGAGCAGGGA CGCCATCCCG 100920

GAACACGCGC GCGGCTCTGC GCCGGCTTTC CCCCACCCCA CCCCCTCCGG GTTCGCAGGG 100980

GCGATGGGGA CGGAAGACTG CGATCACGAA GGGCGGTCGG TTGCGGCTCC CGTGGAGGTT 101040

ATGGCGCTGT ATGCGACCGA CGGGTGCGTT ATCACCTCCT CGCTCGCCCT CCTCACAAAC 101100

TGCCTGCTGG GGGCCGAGCC GTTGTATATA TTCAGCTACG ACGCGTACCG GCCCGATGCG 101160

CCCAATGGCC CCACGGGCGC GCCCACCGAA CAGGAGAGGT TCGAGGGGAG CCGGGCGCTC 101220

TACCGGGATG CGGGGGGGCT AAATGGCGAT TCATTTCGGG TGACCTTTTG TTTATTGGGG 101280

ACGGAAGTGG GCGTGACCCA CCACCCGAAA GGGCGCACCC GGCCCATGTT TGTGTGCGGC 101340

TTCGAGCGAG CGGACGACGT CGCCGTGCTC CAAGACGCCC TGGGCCGCGG GACCCCATTG 101400

CTCCCGGCCC ACATCACAGC AACTCTGGAC TTGGAGGCGA CGTTTGCGCT CCACGCTAAC 101460

ATCATCATGG CTCTCACCGT GGCCATAGTC CACAACGCCC CCGCCCGCAT CGGCAGCGGC 101520

AGCACCGCTC CCCTGTATGA GCCCGGCGAA TCGATGCGCT CGGTCGTCGG GCGCATGTCC 101580

CTGGGGCAGC GCGGCCTCAC CACGCTGTTC GTGCACCACG AGGCGCGCGT GCTGGCGGCG 101640

TACCGCCGGG CGTATTATGG GAGCGCCCAA AGCCCCTTTT GGTTTCTGAG CAAATTCGGC 101700

CCGGACGAAA AGAGCCTGGT GCTGGCCGCT AGGTACTACG TACTCCAGGC TCCGCGCTTG 101760

GGGGGCGCCG GAGCCACGTA CGATCTGCAG GCCGTGAAAG ACATCTGCGC GACCTACGCG 101820

ATCCCCCACG ACCCACGCCC CGACACCCTC AGTGCCGCGT CCTTGACCTC GTTCGCCGCC 101880

ATCACTCGGT TCTGTTGCAC GAGCCAGTAC TCCCGCGGGG CCGCGGCCGC TGGGTTTCCG 101940

CTGTATGTGG AGCGCCGCAT CGCCGCCGAC GTACGCGAGA CCGGCGCGCT GGAGAAGTTC 102000

ATCGCCCACG ATCGCAGTTG CCTGCGCGTG TCCGACCGGG AATTCATTAC GTACATCTAC 102060

CTGGCCCACT TTGAGTGCTT CAGCCCCCCG CGCCTGGCCA CGCATCTCCG GGCCGTGACC 102120

ACCCACGACC CCAGCCCCGC GGCCAGCACG GAGCAGCCCT CGCCCCTGGG TCGGGAGGCG 102180

GTGGAACAGT TCTTCCGGCA CGTGCGCGCC CAGCTGAACA TCCGCGAGTA CGTAAAGCAA 102240

AACGTCACCC CCAGGGAAAC CGCCCTGGCG GGAGACGCGG CCGCCGCCTA CCTGCGCGCG 102300

CGCACGTATG CCCCGGCGGC CCTCACGCCC GCCCCCGCGT ACTGCGGGGT CGCAGACTCG 102360

TCCACCAAAA TGATGGGACG TCTGGCGGAA GCAGAAAGGC TCCTAGTCCC CCACGGCTGG 102420

CCCGCGTTCG CACCAACAAC CCCCGGGGAC GACGCGGGGG GCGGCACTGC CGCCCCCCAG 102480

ACCTGCGGAA TCGTCAAGCG CCTCCTCAAG CTGGCCGCCA CGGAGCAGCA GGGCACGACG 102540

CCCCCGGCGA TCGCGGCTCT CATGCAGGAC GCGTCGGTCC AAACCCCCCT GCCCGTGTAC 102600

AGGATTACCA TGTCCCCGAC CGGCCAGGCG TTTGCCGCGG CGGCGCGGGA CGACTGGGCC 102660

CGCGTGACGC GGGACGCGCG CCCGCCGGAA GCGACCGTGG TCGCGGACGC GGCGGCGGCG 102720

CCCGAGCCCG GCGCGCTCGG CCGGCGGCTC ACGCGCCGCA TTTGCGCCCG GGGCCCCGCG 102780

CCTCCCCCCG GGCGGCCTGG CCGTCGGGGG CCAGATGTAC GTGAACCGCA ACGAGATCTT 102840

CAACGCCGCG CTGGCCGTTA CGAACATCAT CCTGGATCTG GACATCGCCC TGAAGGAGCC 102900

CGTCCCCTTT CCCCGGCTCC ACGAGGCCCT GGGTCACTTT AGGCGCGGGG CGCTGGCGGC 102960

GGTTCAGCTG TTGTTTCCCG CGGCCCGCGT AGACCCCGAC GCCTATCCCT GTTATTTTTT 103020

CAAAAGCGCC TGTCGGCCCC GCGCGCCGCC CGTCTGTGCG GGCGACGGGC CCTCGGCCGG 103080

TGGCGACGAC GGCGACGGGG ACTGGTTCCC CGACGCCGGT GGCGACGACG GCGACGAGGA 103140

GTGGGAGGAG GACACGGACC CCATGGACAC GACCCACGGC CCCCTCCCGG ACGACGAGGC 103200

CGCGTACCTC GACCTGCTAC ACGAACAGAT ACCAGCGGCG ACGCCCAGCG AACCGGACTC 103260

CGTCGTGTGT TCCTGCGCCG ACAAGATCGG GCTGCGCGTG TGCCTACCGG TCCCCGCCCC 103320

GTACGTTGTG CACGGCTCCC TGACGATGCG TGGGGTGGCG AGGGTGATCC AGCAGGCGGT 103380

GCTGTTGGAC CGCGACTTCG TGGAGGCCGT AGGGAGCCAC GTAAAGAACT TTTTGCTGAT 103440

CGATACGGGC GTGTACGCCC ACGGCCACAG CCTGCGCTTG CCGTATTTCG CCAAGATCGG 103500

CCCCGACGGC TCCGCGTGCG GCCGGTTATT GCCCGTCTTC GTGATCCCCC CCGCGTGCGA 103560

GGACGTTCCG GCGTTCGTCG CCGCGCACGC CGACCCGCGG CGCTTCCACT TTCACGCCCC 103620

GCCCATGTTT TCCGCGGCCC CGCGGGAGAT CCGCGTCCTC CACAGCCTGG GCGGGGACTA 103680

TGTCAGCTTT TTCGAGAAGA AGGCGTCGCG CAACGCCCTG GAGCACTTTG GGCGACGCGA 103740

GACCCTGACG GAGGTTCTGG GCCGCTACGA TGTGCGGCCC GACGCCGGGG AGACCGTGGA 103800

GGGGTTCGCG TCAGAACTGC TGGGGCGAAT AGTCGCGTGC ATCGAGGCCC ACTTTCCCGA 103860

GCACGCGCGG GAATATCAGG CCGTGTCCGT TCGCCGGGCC GTCATTAAGG ACGACTGGGT 103920

CCTGCTGCAG CTGATCCCCG GCCGCGGCGC CCTGAACCAA AGCCTCTCGT GTCTGCGCTT 103980

CAAGCACGGC AGGGCAAGTC GCGCGACGGC CCGGACCTTT CTCGCGCTGA GCGTCGGGAC 104040

CAACAACCGC CTATGCGCGT CCCTGTGTCA GCAGTGCTTT GCCACTAAAT GCGATAACAA 104100

CCGCCTGCAC ACGCTGTTTA CCGTCGATGC GGGCACGCCA TGCTCGCGGT CCGCTCCCTC 104160

CAGCACCTCA CGACCGTCAT CTTCATAACG GCCTACGGCC TCGTGCTCGC GTGGTACATC 104220

GTCTTTGGTG CCAGTCCGCT CCACCGATGT ATTTACGCGG TGCGCCCCGC CGGGGCACAC 104280

AACGATACCG CCCTCGTGTG GATGAAGATA AACCAGACGC TGTTGTTTCT GGGCCCGCCG 104340

ACCGCCCCCC CCGGCGGGGC ATGGACCCCC CACGCCCACG TCTGCTACGC CAATATCATC 104400

GAAGGTCGGG CCGTGTCCCT CCCGGCCATC CCCGGCGCCA TGAGCCGCCG GGTCATGAAC 104460



GTGCACGAGG CCGTAAACTG CTTGGAGGCC CTCTGGGACA CCCAGATGCG CCTGGTGGTC 104520

GTCGGTTGGT TTCTGTATCT AGCGTTCGTC GCCCTTCACC AACGACGATG CATGTTCGGC 104580

GTCGTGAGTC CCGCGCACAG CATGGTGGCC CCGGCGACCT ATCTTTTGAA CTACGCCGGC 104640

CGCATAGTGT CGAGCGTGTT CTTGCAATAC CCCTACACGA AAATCACCCG CCTCCTCTGC 104700

GAGCTATCCG TTCAACGCCA GACCCTGGTG CAGCTGTTCG AGGCGGATCC GGTCACCTTC 104760

TTGTACCACC GCCCGGCCGT TGGCGTCATC GTGGGCTGCG AGCTGCTGCT CCGCTTCGTG 104820

GCCCTCGGTC TCATCGTCGG CACCGCTCTC ATCTCCCGGG GCGCCTGCGC GATCACATAC 104880

CCCCTGTTTC TAACAATCAC CACCTGGTGT TTCGTGTCCA TCATCGCCCT GACGGAGCTG 104940

TATTTCATCC TGCGGCGGGA CTCGGCCCCC AAAAACGCGG AACCAGCGGC CCCCAGGGGG 105000

CGCTCCAAAG GGTGGTCGGG CGTCTGCGGG CGCTGCTGTT CCATCATCCT CTCCGGTATC 105060

GCCGTGCGCC TGTGCTATAT CGCCGTCGTG GCCGGGGTGG TGCTTATGGC GCTTCGCTAC 105120

GAACAGGAGA TTCAGCGGCG CCTGTTTGAT CTGTGACGTA ACGCCTCTTC CGTTGGAAGA 105180

GGCGGACCCA GTCGCCCATG CAAATTAAAT ACACGACCCG CCTCGGGCCT ACGCACCCTC 105240

GCACGTCGCA TGCAAATTAA AATCGTGCAC AGAGCCGATC CGGCCTCGGG TCTGCTTGCC 105300

CCTCCCCCGG TCCAGCACAG GCAGGCTCGT CCGACTTCCG CATACACCCC ACCCTACCGC 105360

GTGCTTCCGC ACCCCCGCCT ACGCGTGTAC GCGAAGGCGG ACCCAGACCT GCCGTATGCT 105420

AATTAAATAC ATAAAACCCA CCCTCGGCGT CCGATTGGTT TCTGGGGACG GCGGGGGCGG 105480

GGGCGGTGAC GCCCGACGGG GAGGGACAAG GAGGAGTTTC GGAAAGCCGG CCCCGGTCGT 105540

GCGGGTATAA GGGCAGCCAC CGGCCCACTG GGCGCTGTGT GCTGCCGTGT GCCGACCCCG 105600

GTTGCGCGTC GGTGCCGCTC CTCGATTCGG ACCCGGCCAC TCTCTTCCGA CACGCGCCCC 105660

CTCGGAGGAC ACCCGCCATC CCAGCCCCGG CGACCTACAA CATGGCTACC GACATTGATA 105720

TGCTAATCGA CCTAGGATTG GACCTGTCCG ACAGCGAGCT CGAGGAGGAC GCTCTGGAGC 105780

GGGACGAGGA GGGCCGCCGC GACGACCCCG AGTCCGACAG CAGCGGGGAG TGTTCCTCGT 105840

CGGACGAGGA CATGGAAGAC CCCTGCGGAG ACGGAGGGGC GGAGGCCATC GACGCGGCGA 105900

TTCCCAAAGG TCCCCCGGCC CGCCCCGAGG ACGCCGGCAC CCCCGAAGCC TCGACGCCTC 105960

GCCCGGCAGC GCGGCGGGGA GCCGACGATC CGCCACCCGC GACCACCGGC GTGTGGTCGC 106020

GCCTCGGGAC CAGGCGGTCG GCTTCCCCCC GGGAACCGCA CGGGGGGAAG GTGGCCCGCA 106080

TCCAACCCCC GTCGACCAAG GCACCGCATC CCCGAGGCGG GCGGCGAGGT CGCCGCCGGG 106140

GCCGGGGTCG ATACGGCCCC GGCGGCGCCG ACTCCACACC AAACCCCCGC CGGCGCGTCT 106200

CCAGAAACGC CCACAACCAA GGGGGTCGCC ACCCCGCGTC GGCGCGGACG GACGGCCCCG 106260

GCGCCACCCA CGGCGAGGCG CGGCGCGGAG GGGAGCAGCT CGACGTCTCC GGGGGCCCGC 106320

GGCCACGAGG CACGCGCCAG GCCCCCCCTC CGCTGATGGC GCTGTCCCTG ACCCCCCCGC 106380

ACGCGGACGG CCGCGCCCCG GTCCCGGAGC GAAAGGCGCC CTCTGCCGAC ACCATCGACC 106440

CCGCCGTTCG GGCGGTTCTG CGATCCATAT CCGAGCGCGC GGCGGTCGAG CGCATCAGCG 106500

AAAGCTTTGG ACGCAGTGCC CTGGTCATGC AAGACCCCTT TGGCGGGATG CCGTTTCCCG 106560

CCGCGAACAG CCCCTGGGCT CCCGTGCTGG CCACCCAAGC GGGGGGGTTT GACGCCGAGA 106620

CCCGTCGGGT TTCCTGGGAA ACCCTGGTCG CTCACGGCCC GAGCCTCTAC CGCACATTCG 106680

CAGCCAACCC GCGGGCCGCG TCGACAGCCA AGGCCATGCG CGACTGCGTG CTGCGCCAGG 106740

AAAATCTCAT CGAGGCCCTG GCGTCCGCGG ATGAGACGCT GGCGTGGTGC AAGATGTGCA 106800

TTCACCACAA TCTGCCGCTC CGCCCCCAGG ACCCTATCAT CGGAACGGCG GCCGCCGTGC 106860

TGGAAAACCT CGCCACGCGC CTGCGCCCCT TTCTGCAGTG CTACCTGAAG GCCCGAGGCC 106920

TGTGCGGGCT GGACGACCTG TGCTCGCGGC GACGCCTGTC GGACATTAAG GATATTGCCT 106980

CCTTTGTGTT GGTCATCCTG GCCCGCCTCG CCAACCGCGT CGAGCGCGGC GTGTCGGAGA 107040

TCGACTACAC GACCGTGGGG GTTGGGGCCG GCGAGACGAT GCACTTTTAC ATCCCGGGGG 107100

CCTGCATGGC GGGTCTCATT GAAATACTGG ACACGCACCG CCAGGAGTGT TCCAGTCGCG 107160 

TGTGCGAGCT GACGGCCAGT CACACTATCG CCCCCTTATA TGTGCACGGC AAATACTTCT 107220 

ACTGCAACTC CCTATTTTAG GCAAGAATAA ACATATTGAC GTCAACCCAA GTGGTTCCGT 107280 

GTGATGTTCT TGGCGCGCGC GGCGGGTGGG GCGGAGACTC CGGGGCGATG CCGGCGTGCG 107340

CGTGGGAGGA GGGCGATGAC CCACCGGATA AATGTGGGGC CCCGGCCCGG CCCGCTTCAT 107400 

AGCGCGTCCA GGAACTCACG GCAGACGCGT ATTCACCGAC CCCCCCCCTC GCAACATGAC 107460 

AACGACGCCC CTCTCGAACC TGTTTTTACG GGCCCCGGAC ATCACCCACG TCGCCCCCCC 107520 

GTACTGTCTG AATGCCACGT GGCAGGCCGA AAACGCCCTG CACACGACCA AAACGGACCC 107580 

CGCGTGCCTG GCCGCGCGGA GTTATTTAGT CCGCGCCTCC TGCTCGACCA GCGGCCCCAT 107640

CCACTGTTTT TTCTTTGCGG TGTACAAGGA CTCGCAGCAC TCCCTTCCGC TGGTTACCGA 107700 

GCTCCGCAAC TTCGCGGACC TGGTCAACCA CCCGCCCGTC TTGCGCGAAC TAGAGGATAA 107760 

GCGTGGGGGG CGGCTGCGGT GCACGGGCCC ATTCAGCTGC GGAACCATCA AGGACGTCTC 107820 

CGGTGCATCC CCCGCGGGGG AATACACGAT AAACGGTATC GTGTACCACT GTCACTGTCG 107880 

GTATCCGTTC TCCAAAACCT GCTGGCTCGG GGCATCCGCG GCCCTACAAC ACCTTCGCTC 107940

TATAAGCTCA AGCGGCACGG CCGCTCGCGC GGCAGAACAG CGACGCCACA AAATCAAAAT 108000 

CAAAATCAAG GTATAACCCA CCCCCTTCCC TCCGAGTCCG TATGCAACCT CATTAATAAA 108060 

GAGTGAGAAC CAACCAAAAC AGACGCGGTG TGAGTTTGTG GGTTATAGGA ACCCGGTAAA 108120 

TACCACGCGA CGAACCAGCG TGTGTGTTAA CGCGACTTTT ATTCGTTGTA TCGCGGGAGG 108180 

GGGGAAGCTT ACCGCCAAAG GAAGGCCAAG ATGATAACGA CGACCACCGC GACCACCCCA 108240

AAAACCGCAT GACGACACGT CCCGCCACAC CACCCTGGGG CTTGGGGCGT GTCGGAGCTC 108300 

GACGCACAGC GGGCCGCGCG TTGGGCCCGG TACAGCTCTC GCGAATTGAC GAGCGGGGGT 108360 

CGCCACGTGC GCGAGCTTTG CACGCGGGGT TGGTCGGCCG GCCCCACGGA CCCGCCCGGT 108420 

GGCTCGGTCG GACATGCGGC CATGACCATG GCGTAGGTGG GGGGGCGATC CGAGGTCGCC 108480 

TCTGCGTAAG TAGGGAGGCC CGACGGGAGG TCGCCTCCCA CGCCAGGGTG GGCCCCAATC 108540

ATAGTTTCCG GTAGAAACAG GGGGGTCTCC ACAAACAACC CCCCTGGGCC AAAGCTCCGG 108600 

CGCCGCGCCC GTCGTTCGGC GCGGCGCCTG GCGCGCCGAG CGGCCCGCCA GGCGGCGCGG 108660 

CGCGAGCGGC CACGCTCACA CACCTCGCCG TCACCGGAAG AAGCCGGTGA AACAAGCCCA 108720 

ACCGGCGACG TCCCTGCAGA GTACGGTGGA GGCGAGTCCG TGGGGGTGTC GATATCAATA 108780 

ACGACAAACT GGCCCGCGCT CGCGCCGGCC ACACTCTCGT ATGGGGGCGG GGCGTCAATC 108840

ACGCTATCAT CTCCGTCATC CCTGCATGCG TGGGCATGCC CAGCCCCCAA CGCCATGGTG 108900 

GGGATTCGCG GCTCAGAAGC CTGCATGTCG TGTGGTCGGT CGTAGTCCAA CGTGCCTCCC 108960 

CCACCCACCA CACAGCCGGT CCCCACGCCG ACCACTAGAC CGCAGACGTC GCCCAACCGA 109020 

GGTCCCCGTG CACAGACCGC GCCTTTTATA GCCCCAGGGG TTGCTAATTA ACGCACGCAT 109080 

GCAGACGCAA TTTATTTTGC TCCCCCGCGT CCTCCCCTCC CCCGCGTCCT CCCCTCCCCG 109140

TCCTCCCCTC CCCCGCGTCC TCCCCTCCCC CGCGTCCTCC CCTCCCCTGC GCACACGTGA 109200 

TAGGTCTTGG GAACCCGAGG GGCGACGCGG GGAAAGCGCG CCCCCGCCCG GCCGCCGCGC 109260 

GCCCCCGCCC GGCCGCCGCG CGCCCCCGCC CGGCCGCGCG CCCCCGCCCG GCCGCCGCGC 109320 

GCCCCCGCCC GGCCGCCGCG CGCCCCCGCC CGGCCGCCGC GCGCCCCCGC CCGGCCGCCC 109380 

GCGTCGCGCC GGCGCCCCCT CCCGGCGCTT CCGGGGCCTT TCCTTCCTTC CCCGCCGCGA 109440

CCCCGGCCCC GCCCCACCGC CCCGCCCGGC AGGGGGGCCC CGGCGCCGCG CAGAACACAC 109500 

AGACGAACAC ACGGTGGCGA TCTTTTCTTT ACTTCGGCAG ACCAGCGAGC CCCGGCCCCG 109560 

GCCCGCGCCC CGCCGCCACA CCCACGGCAC CCCCCCCGCC GCCCACCCCG GGGTCCACAC 109620 

AGGAGCGCGC GGGCGGCAGA AACGCGGGCG CGGCGGCGGT CGGGGTGGGA GTGGTGGTGG 109680

GGGACACGAA AACACACCCA CGACACTCTC CCCCCACCCC GACCGCCGCC GCGCCCCACC 109740

GGCGGGATCG CGGCGAGACG CAGCCGGGCC CCCCCCCACC ACCCGCCCAC CCACCTACCC 109800

CGCGCCCGCA GCCTCCGGCA GCACGCCGAC CACCGCCGCC ACCCCCCAAA CAGCCAAGGC 109860

GCGGTGGGGG GCGTGGTGGT GAACGATGGG GGGAACACGG GGGGGAGGGG TCCGGGGCGA 109920

GGCGGGCGGG CGAAGGAAGG GGGGGTGGTG GCGGCGGCGG TGGAAAGCGG AAAAACGGAG 109980

GATGGAAGGG CAGAAGATGG GGAGTCCCGA TCCTCCTCCT GCATCCCCTC GCCTTCCATT 110040

CTCCGGCCCT CCGCGAGTCC CGACGCCCCC CCCCCGCCGC CCGACGAAGG AGACCCAAGC 110100

ACCGCAGCCG GAGAGGCCGA GCGGGGAGTG GGCGGCCGGG CGGGAGGATG GCGGAGAGAG 110160

AGAGAGAGAG AGAGAGAGGG GGGGGGGGGG AGAGGGAAAG CAACGGGAAA GAGAGGCGCG 110220

CGGAAAAGCA GCAAGAGGGG GGACGGGGCG AGCCGGGCAG AGTGCGGAGC CCCCGGAGCC 110280

CGCGGCCGCA GCCGAGCAGC GCCGCGGGCT CCGGGGCCGG GCCGGGCCGG CAACGCCCCG 110340

CGCCGGCCGC GGCGGAGAGA ACCCCTGTGT CATTGTTTAC GTGGCCGCGG GCCAGCAGAC 110400

GGGCCGCGGG CCAGCAGACG GGCCGCGGCG CCAGCGGCCC ACGCCTCCCG CCGCATTAGG 110460

CCCCCGCGGG CATCCGGCGG CCGGCCCCAC GCCCTTCCAT TAAACACTCC CACGTTGGGG 110520

GGGGGCGCGC CAGCTGAGTG CTCTGCGGTT GCGGGCGCCG TGCCCGGAGA TCCATTAAGC 110580

CGCCGGAGAG CCCGAGCCCC GCCCGCGTGT TGCTGTGGGC ATTTCTGCTG CGTCATCCCT 110640

GTCTTTATAA AACCGGGGGC GCGGCAGCAA CGAACACAGG GGCCCGCCGC CGATCGAGAG 110700

GGACTCCGGA GAAGGAAGGC TGCTCCGCGC ACCGGCGCGC CCTTCTCCTC TCCCCTCCCT 110760

ACCTCCCCCT CTCTTCCCCC TTTTTTCCCC CGCCTCCCGT CTTCTTCCGC GCCTCCGAGG 110820

GTCCGCCTCT GCCTCGGGGA CCCCCGGGCG GGCCGGGGCT TGGCCGCCGA GGTGCGCCCC 110880

GGCCGGAGGG GCCCCCGCAC CTCGGCGGCC GCCCCCTCCG GCGCCGCGCG TTCGCGAAAG 110940

GCGCGAAAGG GGCCCCCGGA GGCTTTTTTC GATTCCCGGC CGGGGGTCCC GGGTAGCCGC 111000

CCGGCGCCGG TCGGAAGGCG TCCCCCGCCC GGCGGTCNGG NNNNGGCCCC CGGCGGAGCG 111060

CGGGGGCCCC GGGGCCCCGG GCCGCGCCGG CGGCGTTTCC GCGTTCCGTT TCTTCTCCCT 111120

CCCGGCCGCC CCGCTCCCGG GCCCGACCCT CGCCCCTTCC CTTCTCCTCG TCTTCCCCCG 111180

TCCCGCCGCG CCCCTTCCCT CTTCCTTCTC TCTCTCTGTC TCGCTGTCTC GCTCTCCTCA 111240

CATTTCCCCC CCCCCCCCCC GGCCGCCGCC GCCGCCCTCT GCCCGCGTCC CACCGAGACG 111300

CCGCGCCGCG TGAGCCGTCC GCCGGGGGAC CCAGGCTCCG GGGGGGGGGC GCGCCTGCGT 111360

GTGTCTCGTG TGAGAGAGCG GGCCCCTCGA ACGCCGCGCG TTCTCGCAGG TAGGTTTAGG 111420

GTCGTACAGG TGAGCTTCTG CTGAGGCGGC GGGAGAGGGG GGGGCGGGCG GAAGAGAGAA 111480

GAGAGCAGGG GTTGGGGGAA AACTGTTCTT CCTCCCCCTT TCAAGAAACA CGAGGCGGGG 111540

GTCCCAGAAA GGGCAGGCAG GTCAGCCGCA CCGCCCGCGA GCCAACCCGT ATCCTTTTTT 111600

TCTAGGTGTT TTTGTTTTTG TTTCTGTTTT TGTTTGTTTT GTTATTATTT TCGCGGATCC 111660

GGCGTGTTCG GATCCACCCC CCCCTTTCTC CTTCCTCTTC CCTTCCACCC ACCCCCGTTT 111720

CCCCCCCCCC CCCGTGGTGT CGTCCGGGGG CGTCGTTCCC AGGGGGGGCA GGCGCGGGTC 111780

GGGCCCATAC GCCCACCGCC CCCACGCGCC GGTCACCCCC CCCCCAACAA CCCCAAAGGC 111840

GCGTGCCCGG CCACAGCCGT GGGTGTGGCG CCCGTCCCCT TCCTCTACCG CGTGGGCGCG 111900

GGCGGGGGGG TGGTGGTGGT AGTGGTGGCG GAAGGAAACG GGCCGGGGGG CCGGGGCCGC 111960

TAGGGAAAGG TAGGCACGCG CGCGGTGTGT CGACTTGCAT GCCCCGCAAA ACGCGTCGTG 112020

TCGTGTTGTG TCGTGGTGGG CCGTGTTGTG GTGGGCCGTG TGGTGTGGTG TGGTGTTGCG 112080

AACGCGCGAG CCCCCTCGCC CCGATGGGAG TCTCCCCGCA GCCAGGGTAA GGAGGGGCGG 112140

GCGTGGCGGG CAGGTGTGCG GGCGGGGTGG GGTGAGTGCG GTTGCATGCC TCGGGTCTCC 112200
TCTTCCTGCT CCTCCTCCTT TCTCCCAGCC AGGGTGAGGA GGGGCGGGCG TGGCGGGCGG 112260

GTGTGCGGGC GGGGTGGGCG CCGGGGCGGG GGTGGGCACG GGCGTAAGTG CGGGTGCATG 112320

CCTCGGGTCT TCTCTTCTCC CTCCTCCTTC CTCCCACCCG TCCCCGGGGG CAGAGGGCGT 112380

GCATGCGTTG TGATTCAACC GCCCTCGCCC CCGCCCCACT TTCCCCCCTC TCTATCAAAG 112440

TTCCCTGGCC CCTGGCTTCG CGCCGGTGGT GCGGCTGACC CCCCCCTCCT CCCTCCCCGA 112500

GCCAGGCGCC CTCCCACTCC TGCCCACCAC CCCCCGGGTC TGGCCGGCCA GACGTGCGTG 112560

CTCTGCACGA TCGGGCCCCC CTCCCTGTCA ACACGGACAC ACTCTTTTTT TACCCGCCAG 112620

CCCGCCCACC CACCAAGACA GGGAGCCAGA ACGCAGGCCG GGCCCCGGCT CTGTTCTATG 112680

ATAAAGACCA ACAGGCCTCG GGGGTGGGGG CGGCTTCTCG TGCCCGCCAA GGAGACGCCC 112740

GCCCCGACCA CCCTCGCAGC GCAGGCGCAG GCCGCGACCC CTCTCTGCTC TTTGGAGGGA 112800

GCCGGGGGCG CGACGACGCG GCGCCCCCGG CTCCTTCACA CGGTCCTTCT GCGCGGTGCG 112860

CCTCCCGCCG GGGCTCGGGC CCCGGGCGCT GGGCCGCGGG CCGGAGTGCG CGATGGACGG 112920

GTAGCGCCCC CAGAGCTCGC AGCACCGGGA CCGCGGAATG CACTTGTTCT GCCAGTGCCC 112980

CCTGACGGAC GGGCAGGACC TGTACCTCTG CCCGGTGTAT CCCCGGATGC ACCAGGAGCA 113040

CCTGGTCTGC CCCTTGCACC GCCTGGACGA CGCCCGGCGC CGGGGGCGCA CCTCGGCGGC 113100

GTGGGACGAG GGGCTCGTGC GCGCGTTGAC GCACTCCGGG GGGCTGATGG GCTGCGGGGG 113160

GCGCAGCCTC ACCTTGTCGG AGACCTACTG GGGCCACCCG TTGTACGAGA AACTGGTCCC 113220

GTGGGACCAC CCGCGCGACC TGAAGGTGCC GGAGGCCAGC GCGGTGGGCA CCAGAGCCCT 113280

CGTCCCGCGC GGGCGCGGCC GGCCGCTGCG GGGGCGCCCG GTGCCCCTCA TCCCCCTCGA 113340

TTGTGAGCCG AACGACGGGC TTCCCTTCGG CGGGGGGTGG CCTGGTGGCC GGCTCCGCGG 113400

AGCCCCCGTC CCCCTCCACC CCCCCCCCCC TTCTGCCCCT CCTCTGTCCT TCACCCCCAC 113460

CCTCACCCCC CCCTGCCTGT GCCGGGGCTT GTCGTTGTGT GTGGTCGTAA AACAATACCT 113520

GAAAGACCGG AACAACTTTT GAACTCCTTT TTTTTGAAAT ATAAATATTT TTAAAATGTT 113580

ATTTCAAAAC ACTACGAAAA CTGTGTGAAA CAACAACCGG AAACTACGTC GAGGGGGCGC 113640

GTCCCCCCGG CCCCTACCCC CCCCCCTTTC CCTCCTCCTC CTCCCCGCCC TCCGCCTCCT 113700

CCTCCTCCGC CTCCTCCTCC TCCGCCTCCT CCTCCTCCGC CTCCTCCTCC TCCGCCTCCT 113760

CCTCCTCCGC CTCCTCCTCC TCCGCCTCCT CCTCCTCCGC CGCCGCTGGC GCCGGACCCT 113820

GCTGCCTCTG CGGCTGCCCC CGCGCCGCGG GCGCCTGCGG CCCCGCTCGC CGGGCACCGG 113880

CGCCAGCGGG CTCAGGCTCA GGCCCCGGGC CGCGCCGCGG CGGGAGAACC GGGGGTGGGG 113940

GACCCCCCGC TCCCCGCTCG CGCCCCGCCG CCTCCTTCTC CGCCTCCTGC TCCGGCGCCC 114000

CGGGCTCAGG CTGGGCGCGG AGAAGGCCCC CGCCCGGAGA GGAGGAGGCG CGTCCACAGG 114060

AGCCCGGGGC CCCCCCCTCC AGACGGTGTC AGCAGCCCCG CGCGCCGCGC GGGGGCGCGC 114120

CGGCAGCGGG GCGCGCAGGC CTCAGGCGGG GCGCGGCGGC GGCGGGGGCA CCACAGACGC 114180

TCGCGCCTGC GCCGGCCCGG GCGCGGCGGG CGGCACGGCC ACCTGCGCGT GGCGCGCGGG 114240

GCCAGCGCGT ACTGGGTCCG AGTCTGGCTG TGGGTTCGTG TCTCAGACCC GGCCCGTCCG 114300

CGCTGGCTGC GCGCGCCCAG CCCTCCCGGC CCGCGCCTCC CTCCTGGGCC CCAGGGGGCG 114360

CCGTGGTTGT GGGGGCCACG GCGGGGGGTG CGGCGCCTCC CCCGCCGTCT GTTTCCTCTC 114420

GCCGGGCCCC GGGCGCCCCG CCGCGCCTCT GCCGCCCCCT CTCAGCGACT ACTGATACCC 114480

CCCGAGGACC CGGCGCGCCC CGACAGAGCG CCCCCCGCAG GACGGGAGGC GGCGGCGCCG 114540

CAGAAGCGGG TGGGCGGCGC GGACGCGCGC GGGGGGCGGC CGGCGTCCCC CTTCTCTCCG 114600

GTGAGAGCCG TGCTGCCGGC GCTGCCGTCC CGGCGGGGGT ATGCGGTTCG CGCGTTAATT 114660

GGGAGTGATT TCCCTTGTTT TCGACCTCGA GGTGGCGCCA CCGCCGGCGA GATCTTGATC 114720

ACCTAGGGGG CCCGACGTCC TTAAGCTACT GGGTCTAGGG TGGGGGCGGG CGTTGCCCCG 114780
CGGCGGCGAC GACGACGAGG CGCCCCGCGG TCCCCCGCGG CCAGCCCAGC GCCGCCCGAC 114840

CCTCCAAGGC GCCCAGCGGG GGCGTGGCGG CGGGGGCGCG GCCCCGCGAG AAGCCCCCCG 114900

CCCGCCCTGC ATCAGGCGAC GTCTCCCTCT GTCTCTGCCC TTGGGGGCCA ATCACGGGCT 114960

GGGGGCGGGC TGGGGGCGGG TCACGGGCTG GGGGCGGGTC ACGGGCTGGG GGCGGGTCAC 115020

GGGCTGGGGG CGGGTCACGG GCTGGGGGCG GGCGGGAGTG GCAGCCGGTC CAGTAGCAGG 115080

AGCAGCAGGC ACGGCCCGGT GCCCCCCCAC CCGCTGTCCC GCGCCTGGCA CACAGGGGGG 115140

TCGCTGTCCC TCGCGCCCCG GCAGGCGCCC AACGGGCAGG TCTATTTCAG GTGCCGGCAC 115200

GGCCTGGCGT GCCGGCGGAG CCGGAGGTGC GCCCAGGCCC CCAGCAAGTG ATAGCCCTAC 115260

CACGACTTGC TGGGCGACCG CCAGTGCGGG TGATAGTCCA TGCGGTGGCC CCACAACGTG 115320

TCCCCTGTGC ACAACGCGTT GCCTTAGGTC CAGAAGTACG TGCCCTACGT CTTCCCCACG 115380

TCCGTCCCTT TTGAGACCGT CGCGTCCCCG CCCCGCTAGA GCAGGCACGT GTGCCGTGTG 115440

TGCAGCGGGG GGGGAGGGCG AAGGCGAAGG AGGAGTGGGT GCCCGGGTGG GGGCGTCCTA 115500

GGGACGCGCA GCCGCCCGCA CCCCGACGGG ACCGCGAGCC GGCCCCCGGC CCGGCCCCCG 115560

CACCGGCGCA GGTAGTCCGG GCGGAGCTTG TAGAGGCACA GGCACGACGG GCGGAGCCTC 115620

CACCTCAGCG CCACTTCCAG CAGCAGTCTC TAAGGGTGGA GCCAGAGGAG GAGGCTCAGC 115680

GACGACCGCT CGGTGACGTA CAGCAACTCG TAGGGGGTCC GCACGCCCCG CCGCCCGACG 115740

AACTGTTTCT TTGCCCCCCC CTAAATCTCC CGCGCCCCGC ACTCCGCCCT GGGGGCACGG 115800

CACAGGGGGC ACAGGGAGGG AGTGGGGCCG GGGGGCGGGC GACGAAAAAC AAGCCTTCCC 115860

CCCCTCTTTC CCCAGGCATT GGTTTCCACC AGACGCAGGA AACCTAAGGC TGGGGAGCAG 115920

AGGGGGGGGG ACAGGGGGCG AGAGCCCGAG TCCCGAGGGA CGGAGGGAGC GGGGGGGTTT 115980

CCCAGCCCCC CGCCGCGTGC CGGGTGCCCC CAGGGGGCTG GCGAATTCGC CCGGCCCCCA 116040

GCCGGGGCAG TTCGCAGGGG CGGGGGCTCG GGTGGCGGGC GCTGGTGGGG GTTGGGCGTC 116100

GGCCCACCAG GCCCCTTTTC CCCCCCGGAC TCTGGGCCCC CAGCGGGAGA GTGGCACGGC 116160

CCCCAGACGG CGCCGCCGGC GAGCCCCGGC CCCAGGCGGG CCCTCGAGCA CGGCCCGGCC 116220

CCAAGGTACT CGGCCCCATC CCATCTGAGC TCTGCCGCCG GGCGCCAGAG AGAGAACGGC 116280

CCACAATCAG AGACAGAGAG GCCCAGAGGA GGAGGGCGGC CCGGCGGCGA GGCAGCGAGC 116340

GTCACGGCCC CACGCTTACG CCGGGCTGGC AGTGTGCCCC GACGGAATAT GGGCCGCGGA 116400

TAGGTGAGGG GGTTTCCCCG CCGTAAATGC TAAGGGGGTT ATCGGCGCGC GGGGCCGCCC 116460

CCGCCTCCCT CCCTTAGGGG GGGAGAGCCC CGCCGGGGCA GGGGCCCCTG GTTGGCCCAC 116520

ATGAGGTTCT TGGGGTAATC GTACGCGGCG GGGGGCGGCT GCGTCTACCC TCAGGGGGGC 116580

CGCGGGGCGG CCGCGCCGGG ACTCACCACG GGCGGGGGCC CCTCTTTAAG TAATCGTATG 116640

ATCCTTCGGG TCCCCTGGTT ATCCCCGGCT AGTCGGGTGG GTGGGCCGCC GCGCGCTCCG 116700

AGACGCACAA GACGGTTCTT TCATTAGTCG TATTGGGCCT TGGGGCTCCC TCATTAATGC 116760

GCCCCTCGCT CCCCGGCAGG CTTGCAAAAA TTAATGGTAT TCGCCCTTAC

TTTTCGACGA TTAATGGCGC TCGCCCTTGC GGCCGGGTAA TTTTCAACGA TTAATGGTAC 116880

TCGCCCCTAC CGCCGGCCCT GGCGGATAAT TTTCAAAGAT TAATGGTATG GCCCTTCGGC 116940

CGCGCCCCGC CAGCGGCCCC GCCTCAGGCC CGGGCGCGCC GCCGCGCGCC AACCGGCCGC 117000

GGCGGGGGAC CCCGCCCGCC TCGCCGCCCC GCCGCGGCCC GGGAGCGCCT ATATATGCGC 117060

CCCGAGGGTA GCAGAGAAGC CTCTCGCCGG AGCGCGTCTG GAAGCCTCGA GGCCCCGAGG 117120

CGGCCGGCTC CGGCGGGAGC GGCCAAGTTG GGATCTGGCG GGCTGCCGGG CCCGGGCGCC 117180

GCCGCCTCCT GGGCGCGCGG CGGCGGCGGC GGA 117213



(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:

Met Arg Thr Pro Ala Asp Asp Val Ser Trp Arg Tyr Glu Ala Pro Ser
1 5 10 15

Val Ile Asp Tyr Ala Arg Ile Asp Gly Ile Phe Leu Arg Tyr His Cys
20 25 30

Pro Gly Leu Asp Thr Phe Leu Trp Asp Arg His Ala Gln Arg Ala Tyr
35 40 45

Leu Val Asn Pro Phe Leu Phe Ala Ala Gly Phe Leu Glu Asp Leu Ser
50 55 60

His Ser Val Phe Pro Ala Asp Thr Gln Glu Thr Thr Thr Arg Arg Ala
65 70 75 80

Leu Tyr Lys Glu Ile Arg Asp Ala Leu Gly Ser Arg Lys Gln Ala Val
85 90 95

Ser His Ala Pro Val Arg Ala Gly Cys Val Asn Phe Asp Tyr Ser Arg
100 105 110

Thr Arg Arg Cys Val Gly Arg Arg Asp Leu Arg Pro Ala Asn Thr Thr
115 120 125

Ser Thr Trp Glu Pro Pro Val Ser Ser Asp Asp Glu Ala Ser Ser Gln
130 135 140

Ser Lys Pro Leu Ala Thr Gln Pro Pro Val Leu Ala Leu Ser Asn Ala
145 150 155 160

Pro Pro Arg Arg Val Ser Pro Thr Arg Gly Arg Arg Arg His Thr Arg
165 170 175

Leu Arg Arg Asn
180

(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 334 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:

Met Lys Arg Ala Arg Ser Arg Ser Pro Ser Pro Pro Ser Arg Pro Ser
1 5 10 15

Ser Pro Phe Arg Thr Pro Pro His Gly Gly Ser Pro Arg Arg Glu Val
20 25 30

Gly Ala Gly Ile Leu Ala Ser Asp Ala Thr Ser His Val Cys Ile Ala
35 40 45

Ser His Pro Gly Ser Gly Ala Gly Tyr Pro Thr Arg Leu Ala Ala Gly
50 55 60

Ser Ala Val Gln Arg Arg Arg Pro Arg Gly Cys Pro Pro Gly Val Met
65 70 75 80

Phe Ser Ala Ser Thr Thr Pro Glu Gln Pro Leu Gly Leu Ser Gly Asp
85 90 95

Ala Thr Pro Pro Leu Pro Thr Ser Val Pro Leu Asp Trp Ala Ala Phe
100 105 110

Arg Arg Ala Phe Leu Ile Asp Asp Ala Trp Arg Pro Leu Leu Glu Pro
115 120 125

Glu Leu Ala Asn Pro Leu Thr Ala Arg Leu Leu Ala Glu Tyr Asp Arg
130 135 140

Arg Cys Gln Thr Glu Glu Val Leu Pro Pro Arg Glu Asp Val Phe Ser
145 150 155 160

Trp Thr Arg Tyr Cys Thr Pro Asp Asp Val Arg Val Val Ile Ile Gly
165 170 175

Gln Asp Pro Tyr His His Pro Gly Gln Ala His Gly Leu Ala Phe Ser
180 185 190

Val Arg Ala Asp Val Pro Val Pro Pro Ser Leu Arg Asn Val Leu Ala
195 200 205

Ala Val Lys Asn Cys Tyr Pro Asp Ala Arg Met Ser Gly Arg Gly Cys
210 215 220

Leu Glu Lys Trp Ala Arg Asp Gly Val Leu Leu Leu Asn Thr Thr Leu
225 230 235 240

Thr Val Lys Arg Gly Ala Ala Ala Ser His Ser Lys Leu Gly Trp Asp
245 250 255

Arg Phe Val Gly Gly Val Val Arg Arg Leu Ala Ala Arg Arg Pro Gly
260 265 270

Leu Val Phe Met Leu Trp Gly Ala His Ala Gln Asn Ala Ile Arg Pro
275 280 285

Asp Pro Arg Gln His Tyr Val Leu Lys Phe Ser His Pro Ser Pro Leu
290 295 300

Ser Lys Val Pro Phe Gly Thr Cys Gln His Phe Leu Ala Ala Asn Arg
305 310 315 320

Tyr Leu Glu Thr Arg Asp Ile Met Pro Ile Asp Trp Ser Val
325 330

(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 231 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

Met Val Lys Ser Arg Val Ser Tyr Arg Ser Val Met Ser Gly Val Gly
1 5 10 15

Glu Glu Arg Val Pro Ser Ala Phe Thr Ile Leu Ala Ser Trp Gly Trp
20 25 30

Thr Phe Ala Pro Gln Asn His Asp Pro Gly Asp Asn Thr Thr Pro Ile
35 40 45

Glu Ser Ile Ala Gly Thr Ala Pro Asp Ala His Val Gly Pro Leu Asp
50 55 60

Gly Glu Pro Asp Arg Asp Ala Ile Ser Pro Leu Thr Ser Ser Val Ala
65 70 75 80

Gly Asp Pro Pro Gly Ala Asp Gly Pro Tyr Val Thr Phe Asp Thr Leu
85 90 95

Phe Met Val Ser Ser Ile Asp Glu Leu Gly Arg Arg Gln Leu Thr Asp
100 105 110

Thr Ile Arg Lys Asp Leu Arg Leu Ser Leu Ala Lys Phe Ser Ile Ala
115 120 125

Cys Thr Lys Thr Ser Ser Phe Ser Gly Thr Ala Ala Arg Gln Arg Lys
130 135 140

Arg Gly Ala Pro Pro Gln Arg Thr Cys Val Pro Arg Ser Asn Lys Ser
145 150 155 160

Leu Gln Met Phe Val Leu Cys Lys Arg Ala Asn Ala Ala Gln Val Arg
165 170 175

Glu Gln Leu Arg Ala Val Ile Arg Ser Arg Lys Pro Arg Lys Tyr Tyr
180 185 190

Thr Arg Ser Ser Asp Gly Arg Leu Cys Pro Ala Val Pro Val Phe Val
195 200 205

His Glu Phe Val Ser Ser Glu Pro Met Arg Leu His Arg Asp Asn Val
210 215 220

Met Leu Ser Thr Glu Pro Asp
225 230

(2) INFORMATION FOR SEQ ID NO: 221:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

Met Gly Asn Pro Gln Thr Thr Ile Ala Tyr Ser Leu His His Pro Arg
1 5 10 15

Ala Ser Leu Thr Ser Ala Leu Pro Asp Ala Ala Gln Val Val His Val
20 25 30

Phe Glu Ser Gly Thr Arg Ala Val Leu Thr Arg Gly Arg Ala Arg Gln
35 40 45

Asp Arg Leu Pro Arg Gly Gly Val Val Ile Gln His Thr Pro Ile Gly
50 55 60

Leu Leu Val Ile Ile Asp Cys Arg Ala Glu Phe Cys Ala Tyr Arg Phe
65 70 75 80

Ile Gly Arg Ala Ser Thr Gln Arg Leu Glu Arg Trp Trp Asp Ala His
85 90 95

Met Tyr Ala Tyr Pro Phe Asp Ser Trp Val Ser Ser Ser His Gly Glu
100 105 110

Ser Val Arg Ser Ala Thr Ala Gly Ile Leu Thr Val Val Trp Thr Pro
115 120 125

Asp Thr Ile Tyr Ile Thr Ala Thr Ile Tyr Gly Thr Ala Pro Glu Ala
130 135 140

Arg Cys Asp Asn Ala Pro Leu Asp Val Arg Pro Thr Thr Pro Pro Ala
145 150 155 160

Pro Val Ser Pro Thr Ala Gly Glu Phe Pro Ala Asn Thr Thr Asp Leu
165 170 175

Leu Val Glu Val Leu Arg Glu Ile Gln Ile Ser Pro Thr Leu Asp Asp
180 185 190

Ala Asp Pro Thr Pro Gly Thr
195

(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 877 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

Met Ala Ala Ser Gly Gly Glu Gly Ser Arg Asp Val Arg Ala Pro Gly
1 5 10 15

Pro Pro Pro Gln Gln Pro Gly Ala Arg Pro Ala Val Arg Phe Arg Asp
20 25 30

Glu Ala Phe Leu Asn Phe Thr Ser Met His Gly Val Gln Pro Ile Ile
35 40 45

Ala Arg Ile Arg Glu Leu Ser Gln Gln Gln Leu Asp Val Thr Gln Val
50 55 60

Pro Arg Leu Gln Trp Phe Arg Asp Val Ala Ala Leu Glu Val Pro Thr
65 70 75 80

Gly Leu Pro Leu Arg Glu Phe Pro Phe Ala Ala Tyr Leu Ile Thr Gly
85 90 95

Asn Ala Gly Ser Gly Lys Ser Thr Cys Val Gln Thr Leu Asn Glu Val
100 105 110

Leu Asp Cys Val Val Thr Gly Ala Thr Arg Ile Ala Ala Gln Asn Met
115 120 125

Tyr Val Lys Leu Ser Gly Ala Phe Leu Ser Arg Pro Ile Asn Thr Ile
130 135 140

Phe His Glu Phe Gly Phe Arg Gly Asn His Val Gln Ala Gln Leu Gly
145 150 155 160

Gln His Pro Tyr Thr Leu Ala Ser Ser Pro Ala Ser Leu Glu Asp Leu
165 170 175

Gln Arg Arg Asp Leu Thr Tyr Tyr Trp Glu Val Ile Leu Asp Ile Thr
180 185 190

Lys Arg Ala Ala His Gly Gly Glu Asp Ala Arg Asn Glu Phe His Ala
195 200 205

Leu Thr Ala Leu Glu Gln Thr Leu Gly Leu Gly Gln Gly Ala Leu Thr
210 215 220

Arg Leu Ala Ser Val Thr His Gly Ala Leu Pro Ala Phe Thr Arg Ser
225 230 235 240

Asn Ile Ile Val Ile Asp Glu Ala Gly Leu Leu Gly Arg His Leu Leu
245 250 255

Thr Thr Val Val Tyr Cys Trp Trp Met Ile Asn Ala Leu Tyr His Thr
260 265 270

Pro Gln Tyr Ala Gly Arg Leu Arg Pro Val Leu Val Cys Val Gly Ser
275 280 285

Pro Thr Gln Thr Ala Ser Leu Glu Ser Thr Phe Glu His Gln Lys Leu
290 295 300

Arg Cys Ser Val Arg Gln Ser Glu Asn Val Leu Thr Tyr Leu Ile Cys
305 310 315 320

Asn Arg Thr Leu Arg Glu Tyr Thr Arg Leu Ser His Ser Trp Ala Ile
325 330 335

Phe Ile Asn Asn Lys Arg Cys Val Glu His Glu Phe Gly Asn Leu Met
340 345 350

Lys Val Leu Glu Tyr Gly Leu Pro Ile Thr Glu Glu His Met Gln Phe
355 360 365

Val Asp Arg Phe Val Val Pro Glu Ser Tyr Ile Thr Asn Pro Ala Asn
370 375 380

Leu Pro Gly Trp Thr Arg Leu Phe Ser Ser His Lys Glu Val Ser Ala
385 390 395 400

Tyr Met Ala Lys Leu His Ala Tyr Leu Lys Val Thr Arg Glu Gly Glu
405 410 415

Phe Val Val Phe Thr Leu Pro Val Leu Thr Phe Val Ser Val Lys Glu
420 425 430

Phe Asp Glu Tyr Arg Arg Leu Thr Gln Gln Pro Thr Leu Thr Met Glu
435 440 445

Lys Trp Ile Thr Ala Asn Ala Ser Arg Ile Thr Asn Tyr Ser Gln Ser
450 455 460

Gln Asp Gln Asp Ala Gly His Val Arg Cys Glu Val His Ser Lys Gln
465 470 475 480

Gln Leu Val Val Ala Arg Asn Asp Ile Thr Tyr Val Leu Asn Ser Gln
485 490 495

Val Ala Val Thr Ala Arg Leu Arg Lys Met Val Phe Gly Phe Asp Gly
500 505 510

Thr Phe Arg Thr Phe Glu Ala Val Leu Arg Asp Asp Ser Phe Val Lys
515 520 525

Thr Gln Gly Glu Thr Ser Val Glu Phe Ala Tyr Arg Phe Leu Ser Arg
530 535 540

Leu Met Phe Gly Gly Leu Ile His Phe Tyr Asn Phe Leu Gln Arg Pro
545 550 555 560

Gly Leu Asp Ala Thr Gln Arg Thr Leu Ala Tyr Gly Arg Leu Gly Glu
565 570 575

Leu Thr Ala Glu Leu Leu Ser Leu Arg Arg Asp Ala Ala Gly Ala Ser
580 585 590

Ala Thr Arg Ala Ala Asp Thr Ser Asp Arg Ser Pro Gly Glu Arg Ala
595 600 605

Phe Asn Phe Lys His Leu Gly Pro Arg Asp Gly Gly Pro Asp Asp Phe
610 615 620

Pro Asp Asp Asp Leu Asp Val Ile Phe Ala Gly Leu Asp Glu Gln Gln
625 630 635 640

Leu Asp Val Phe Tyr Cys His Tyr Ala Leu Glu Glu Pro Glu Thr Thr
645 650 655

Ala Ala Val His Ala Gln Phe Gly Leu Leu Lys Arg Ala Phe Leu Gly
660 665 670

Arg Tyr Leu Ile Leu Arg Glu Leu Phe Gly Glu Val Phe Glu Ser Ala
675 680 685

Pro Phe Ser Thr Tyr Val Asp Asn Val Ile Phe Arg Gly Cys Glu Leu
690 695 700

Leu Thr Gly Ser Pro Arg Gly Gly Leu Met Ser Val Gln Thr Asp Asn
705 710 715 720

Tyr Thr Leu Met Gly Tyr Thr Tyr Thr Arg Val Phe Ala Phe Ala Glu
725 730 735

Glu Leu Arg Arg Arg His Ala Thr Ala Gly Val Ala Glu Phe Leu Glu
740 745 750

Glu Ser Pro Leu Pro Tyr Ile Val Leu Arg Asp Gln His Gly Phe Met
755 760 765

Ser Val Val Asn Thr Asn Ile Ser Glu Phe Val Glu Ser Ile Asp Ser
770 775 780

Thr Glu Leu Ala Met Ala Ile Asn Ala Asp Tyr Gly Ile Ser Ser Lys
785 790 795 800

Leu Ala Met Thr Ile Thr Arg Ser Gln Gly Leu Ser Leu Asp Lys Val
805 810 815

Ala Ile Cys Phe Thr Pro Gly Asn Leu Arg Leu Asn Ser Ala Tyr Val
820 825 830

Ala Met Ser Arg Thr Thr Ser Ser Glu Phe Leu His Met Asn Leu Asn
835 840 845

Pro Leu Arg Glu Arg His Glu Arg Asp Asp Val Ile Ser Glu His Ile
850 855 860

Leu Ser Ala Leu Arg Asp Pro Asn Val Val Ile Val Tyr
865 870 875



(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 292 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

Met Ala Asp Pro Thr Pro Ala Asp Glu Gly Thr Ala Ala Ala Ile Leu
1 5 10 15

Lys Gln Ala Ile Ala Gly Asp Arg Ser Leu Val Glu Val Ala Glu Gly
20 25 30

Ile Ser Asn Gln Ala Leu Leu Arg Met Ala Cys Glu Val Arg Gln Val
35 40 45

Ser Asp Arg Gln Pro Arg Phe Thr Ala Thr Ser Val Leu Arg Val Asp
50 55 60

Val Thr Pro Arg Gly Arg Leu Arg Phe Val Leu Asp Gly Ser Ser Asp
65 70 75 80

Asp Ala Tyr Val Ala Ser Glu Asp Tyr Phe Lys Arg Cys Gly Asp Gln
85 90 95

Pro Tyr Gly Phe Ala Val Val Val Leu Thr Ala Asn Glu Asp His Val
100 105 110

His Ser Leu Ala Val Pro Pro Leu Val Leu Leu His Arg Leu Ser Leu
115 120 125

Phe Arg Pro Thr Asp Leu Arg Asp Phe Glu Leu Val Cys Leu Leu Met
130 135 140

Tyr Leu Glu Asn Cys Pro Arg Ser His Ala Thr Pro Ser Leu Phe Val
145 150 155 160

Lys Val Ser Ala Trp Leu Gly Val Val Ala Arg His Asp Phe Glu Arg
165 170 175

Val Arg Cys Leu Leu Leu Arg Ser Cys His Trp Ile Leu Asn Thr Leu
180 185 190

Met Cys Met Ala Gly Val Lys Pro Phe Asp Asp Glu Leu Val Leu Pro
195 200 205

His Trp Tyr Met Ala His Tyr Leu Leu Ala Asn Asn Pro Pro Pro Val
210 215 220

Leu Ser Ala Leu Phe Cys Ala Thr Pro Gln Ser Ser Ala Leu Gln Leu
225 230 235 240

Pro Gly Pro Val Pro Arg Thr Asp Cys Val Ala Tyr Asn Pro Ala Gly
245 250 255

Val Met Gly Ser Cys Trp Lys Ser Lys Asp Leu Arg Ser Ala Leu Val
260 265 270

Tyr Trp Trp Leu Ser Gly Ser Pro Lys Arg Arg Thr Ser Ser Leu Phe
275 280 285

Tyr Arg Phe Cys
290

(2) INFORMATION FOR SEQ ID NO: 224:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 734 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

Met Glu Ala Pro Gly Ile Val Trp Val Glu Glu Ser Val Ser Ala Ile
1 5 10 15

Thr Leu Tyr Ala Val Trp Leu Pro Pro Arg Thr Arg Asp Cys Leu His
20 25 30

Ala Leu Leu Tyr Leu Val Cys Arg Asp Ala Ala Gly Glu Ala Arg Ala
35 40 45

Arg Phe Ala Glu Val Ser Val Gly Ser Ser Asp Leu Gln Asp Phe Tyr
50 55 60

Gly Ser Pro Asp Val Ser Ala Ala Gly Ala Val Ala Ala Ala Arg Ala
65 70 75 80

Ala Pro Ala Asp Leu Glu Pro Leu Gly Asp Pro Thr Leu Trp Arg Ala
85 90 95

Leu Tyr Ala Cys Val Leu Ala Ala Leu Glu Arg Gln Thr Gly Pro Val
100 105 110

Phe Val Pro Leu Arg Leu Gly Trp Asp Pro Gln Thr Gly Leu Val Val
115 120 125

Arg Val Glu Arg Ala Ser Trp Gly Pro Pro Ala Ala Pro Arg Ala Ala
130 135 140

Leu Leu Asp Val Glu Ala Lys Val Asp Val Asp Pro Leu Ala Ala Arg
145 150 155 160

Val Ala Glu His Pro Gly Ala Arg Leu Ala Trp Ala Arg Leu Ala Ala
165 170 175

Ile Arg Asp Ser Pro Gln Cys Ala Ser Ser Ala Ser Leu Ala Val Thr
180 185 190

Ile Thr Thr Arg Thr Ala Arg Phe Ala Arg Glu Tyr Thr Thr Leu Ala
195 200 205

Phe Pro Pro Thr Ser Lys Glu Gly Ala Phe Ala Asp Leu Val Glu Val
210 215 220

Cys Glu Val Gly Leu Arg Pro Arg Gly His Pro Gln Arg Val Thr Ala
225 230 235 240

Arg Val Leu Leu Pro Arg Gly Tyr Asp Tyr Phe Val Ser Ala Gly Asp
245 250 255

Gly Phe Ser Ala Pro Ala Leu Val Phe Arg Gln Trp His Thr Thr Val
260 265 270

His Ala Ala Pro Gly Ala Pro Val Phe Ala Phe Leu Gly Pro Gly Phe
275 280 285

Glu Val Arg Gly Gly Pro Val Gln Tyr Phe Ala Val Leu Gly Phe Pro
290 295 300

Gly Trp Pro Thr Phe Thr Val Pro Ala Ala Ala Ala Ala Glu Ser Ala
305 310 315 320

Arg Asp Leu Val Arg Gly Ala Ala Ala Thr His Ala Ala Cys Leu Gly
325 330 335

Ala Trp Pro Ala Val Gly Ala Arg Val Val Leu Pro Pro Arg Ala Trp
340 345 350

Pro Ala Val Ala Ser Glu Ala Ala Gly Arg Leu Leu Pro Ala Phe Arg
355 360 365

Glu Ala Val Ala Arg Trp His Pro Thr Ala Thr Thr Ile Gln Leu Leu
370 375 380

Asp Pro Pro Ala Ala Val Gly Pro Val Trp Thr Ala Arg Phe Cys Phe
385 390 395 400

Ser Gly Leu Gln Ala Gln Leu Leu Ala Ala Gly Leu Gly Glu Ala Gly
405 410 415

Leu Pro Glu Arg Arg Ala Gly Leu Glu Arg Leu Asp Ala Leu Val Ala
420 425 430

Ala Ala Pro Ser Glu Pro Trp Ala Arg Ala Val Leu Glu Arg Leu Val
435 440 445

Pro Asp Ala Cys Asp Ala Cys Pro Ala Leu Arg Gln Leu Leu Gly Gly
450 455 460

Val Met Ala Ala Val Cys Leu Gln Ile Glu Gln Thr Ala Ser Ser Val
465 470 475 480

Lys Phe Ala Val Cys Gly Gly Thr Gly Ala Ala Phe Trp Gly Leu Phe
485 490 495

Asn Val Asp Pro Gly Asp Ala Asp Ala Ala His Gly Ala Ile Gln Asp
500 505 510

Ala Arg Arg Ala Leu Glu Ala Ser Val Arg Ala Val Leu Ser Ala Asn
515 520 525

Gly Ile Arg Pro Arg Leu Ala Pro Ser Leu Ala Leu Glu Gly Val Tyr
530 535 540

Thr His Val Val Thr Trp Ser Gln Thr Gly Ala Trp Phe Trp Asn Ser
545 550 555 560

Arg Asp Asp Thr Asp Phe Leu Gln Gly Phe Pro Leu Arg Gly Pro Ala
565 570 575

Tyr Ala Ala Ala Ala Glu Val Met Arg Asp Ala Leu Arg Arg Ile Leu
580 585 590

Arg Arg Pro Ala Ala Gly Pro Pro Glu Glu Ala Val Cys Ala Arg Ile
595 600 605

Met Glu Asp Ala Cys Asp Arg Phe Val Leu Asp Ala Phe Gly Arg Arg
610 615 620

Leu Asp Ala Glu Tyr Trp Ser Val Leu Thr Pro Pro Gly Glu Ala Asp
625 630 635 640

Asp Pro Leu Pro Gln Thr Ala Phe Arg Gly Gly Ala Leu Leu Asp Ala
645 650 655

Glu Gln Tyr Trp Arg Arg Val Val Arg Val Cys Pro Gly Gly Gly Glu
660 665 670

Ser Val Gly Val Pro Val Asp Leu Tyr Pro Arg Pro Leu Val Leu Pro
675 680 685

Pro Val Asp Cys Ala His His Leu Arg Glu Ile Leu Arg Glu Ile Gln
690 695 700

Leu Val Phe Thr Gly Val Leu Glu Gly Val Trp Gly Glu Gly Gly Ser
705 710 715 720

Phe Val Tyr Pro Phe Glu Glu Lys Met Arg Phe Leu Phe Pro
725 730



(2) INFORMATION FOR SEQ ID NO: 225:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 461 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:

Met Gly Arg Arg Ala Pro Arg Gly Ser Pro Glu Ala Ala Pro Gly Ala
1 5 10 15

Asp Val Ala Pro Gly Ala Arg Ala Ala Trp Trp Val Trp Cys Val Gln
20 25 30

Val Ala Thr Phe Ile Val Ser Ala Ile Cys Val Val Gly Leu Leu Val
35 40 45

Leu Ala Ser Val Phe Arg Asp Arg Phe Pro Cys Leu Tyr Ala Pro Ala
50 55 60

Thr Ser Tyr Ala Glu Ala Asn Ala Thr Val Glu Val Arg Gly Gly Val
65 70 75 80

Ala Val Pro Leu Arg Leu Asp Thr Gln Ser Leu Leu Ala Thr Tyr Ala
85 90 95

Ile Thr Ser Thr Leu Leu Leu Ala Ala Ala Val Tyr Ala Ala Val Gly
100 105 110

Ala Val Thr Ser Arg Tyr Glu Arg Ala Leu Asp Ala Ala Arg Arg Leu
115 120 125

Ala Ala Ala Arg Met Ala Met Pro His Ala Thr Leu Ile Ala Gly Asn
130 135 140

Val Cys Ala Trp Leu Leu Gln Ile Thr Val Leu Leu Leu Ala His Arg
145 150 155 160

Ile Ser Gln Leu Ala His Leu Ile Tyr Val Leu His Phe Ala Cys Leu
165 170 175

Val Tyr Leu Ala Ala His Phe Cys Thr Arg Gly Val Leu Ser Gly Thr
180 185 190

Tyr Leu Arg Gln Val His Gly Leu Ile Asp Pro Ala Pro Thr His His
195 200 205

Arg Ile Val Gly Pro Val Arg Ala Val Met Thr Asn Ala Leu Leu Leu
210 215 220

Gly Thr Leu Leu Cys Thr Ala Ala Ala Ala Val Ser Leu Asn Thr Ile
225 230 235 240

Ala Ala Leu Asn Phe Asn Phe Ser Ala Pro Ser Met Leu Ile Cys Leu
245 250 255

Thr Thr Leu Phe Ala Leu Leu Val Val Ser Leu Leu Leu Val Val Glu
260 265 270

Gly Val Leu Cys His Tyr Val Arg Val Leu Val Gly Pro His Leu Gly
275 280 285

Ala Ile Ala Ala Thr Gly Ile Val Gly Leu Ala Cys Glu His Tyr His
290 295 300

Thr Gly Gly Tyr Tyr Val Val Glu Gln Gln Trp Pro Gly Ala Gln Thr
305 310 315 320

Gly Val Arg Val Val Ala Ala Phe Ala Met Ala Val Leu Arg Cys Thr
325 330 335

Arg Ala Tyr Leu Tyr His Arg Arg His His Thr Lys Phe Phe Val Arg
340 345 350

Met Arg Asp Thr Arg His Arg Ala His Ser Ala Leu Arg Arg Val Arg
355 360 365

Ser Ser Met Arg Gly Ser Arg Arg Gly Gly Pro Pro Gly Asp Pro Gly
370 375 380

Tyr Ala Glu Thr Pro Tyr Ala Ser Val Ser His His Ala Glu Ile Asp
385 390 395 400

Arg Tyr Gly Asp Ser Asp Gly Asp Pro Ile Tyr Asp Glu Val Ala Pro
405 410 415

Asp His Glu Ala Glu Leu Tyr Ala Arg Val Gln Arg Pro Gly Pro Val
420 425 430

Pro Asp Ala Glu Pro Ile Tyr Asp Thr Val Glu Gly Tyr Ala Pro Arg
435 440 445

Ser Ala Gly Glu Pro Val Tyr Ser Thr Val Arg Arg Trp
450 455 460

(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 96 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:

Met Gly Leu Ala Phe Ser Gly Ala Arg Pro Cys Cys Cys Arg His Asn
1 5 10 15

Val Ile Ile Thr Asp Gly Gly Glu Val Val Ser Leu Thr Ala His Glu
20 25 30

Phe Asp Val Val Asp Ile Glu Ser Glu Glu Glu Gly Asn Phe Tyr Val
35 40 45

Pro Pro Asp Met Arg Val Val Thr Arg Ala Pro Gly Pro Gln Tyr Arg
50 55 60

Arg Ala Ser Asp Pro Pro Ser Arg His Thr Arg Arg Arg Asp Pro Asp
65 70 75 80

Val Ala Arg Pro Pro Ala Thr Leu Thr Pro Pro Leu Ser Asp Ser Glu
85 90 95

(2) INFORMATION FOR SEQ ID NO: 227: 

35 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 618 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



40 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 
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Met Ala Ala Ala Ala Thr Pro Gly Ala Lys Arg Pro Ala Asp Pro Ala 

1 5 10 15 

Arg Asp Pro Asp Ser Pro Pro Lys Arg Pro Arg Pro Asn Ser Leu Asp 
5 20 25 30 

Leu Ala Thr Val Phe Gly Pro Arg Pro Ala Pro Pro Arg Pro Thr Ser 

35 40 45 

Pro Gly Ala Pro Gly Ser His Trp Pro Gin Ser Pro Pro Arg Gly Gin 
50 55 60 

10 Pro Asp Gly Gly Ala Pro Gly Glu Lys Ala Arg Pro Asp Ala Leu Ser 
65 70 75 80 

Glu Ala Ser Ser Gly Pro Pro Thr Pro Asp lie Pro Leu Ser Pro Gly 

85 90 95 

Gly Ala His Ala lie Asp Pro Asp Cys Ser Pro Gly Pro Pro Asp Pro 
15 100 105 110 

Asp Pro Met Trp Ser Ala Ser Ala lie Pro Asn Ala Leu Pro Pro His 

115 120 125 

lie Leu Ala Glu Thr Phe Glu Arg His Leu Arg Gly Leu Leu Arg Gly 
130 135 140 

20 Val Arg Ser Pro Leu Ala lie Gly Pro Leu Trp Ala Arg Leu Asp Tyr 
145 150 155 160 

Leu Cys Ser Leu Val Val Ser Leu Glu Ala Ala Gly Met Val Asp Arg 

165 170 175 

Gly Leu Gly Arg His Leu Trp Arg Leu Thr Arg Arg Ala Pro Pro Ser 
25 180 185 190 

Ala Ala Glu Ala Val Ala Pro Arg Pro Leu Met Gly Phe Tyr Glu Ala 

195 200 205 

Ala Thr Gin Asn Gin Ala Asp Cys Gin Leu Trp Ala Leu Leu Arg Arg 
210 215 220 

30 Gly Leu Thr Thr Ala Ser Thr Leu Arg Trp Gly Ala Gin Gly Pro Cys 
225 230 235 240 

Phe Ser Ser Gin Trp Leu Thr His Asn Ala Ser Leu Arg Leu Asp Ala 

245 250 255 

Gin Ser Ser Ala Val Met Phe Gly Arg Val Asn Glu Pro Thr Ala Arg 
35 260 265 270 

Asn Leu Leu Phe Arg Tyr Cys Val Gly Arg Ala Asp Ala Gly Val Asn 

275 280 285 

Asp Asp Ala Asp Ala Gly Arg Phe Val Phe His Gin Pro Gly Asp Leu 
290 295 300 

40 Ala Glu Glu Asn Val His Ala Cys Gly Val Leu Met Asp Gly His Thr 
305 310 315 320 

Gly Met Val Gly Ala Ser Leu Asp He Leu Val Cys Pro Arg Asp Pro 
325 330 335 
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His Gly Tyr Leu Ala Pro Ala Pro Gin Thr Pro Leu Ala Phe Tyr Glu 

340 345 350 

Val Lys Cys Arg Ala Lys Tyr. Ala Phe Asp Pro Ala Asp Pro Gly Ala 
355 360 365 

5 Pro Ala Ala Ser Ala Tyr Glu Asp Leu Met Ala Arg Arg Ser Pro Glu 
370 375 380 

Ala Phe Arg Ala Phe lie Arg Ser lie Pro Asn Pro Gly Val Arg Tyr 
385 390 395 400 

Phe Ala Pro Gly Arg Val Pro Gly Pro Glu Glu Ala Leu Val Thr Gln
405 410 415

Asp Arg Asp Trp Leu Asp Ser Arg Ala Ala Gly Glu Lys Arg Arg Cys 

420 425 430 

Ser Ala Pro Asp Arg Ala Leu Val Glu Leu Asn Ser Gly Val Val Ser 
435 440 445 

Glu Val Leu Leu Phe Gly Val Pro Asp Leu Glu Arg Arg Thr Ile Ser
450 455 460 

Pro Val Ala Trp Ser Ser Gly Glu Leu Val Arg Arg Glu Pro Ile Phe
465 470 475 480 

Ala Asn Pro Arg His Pro Asn Phe Lys Gln Ile Leu Val Gln Gly Tyr
485 490 495

Val Leu Asp Ser His Phe Pro Asp Cys Pro Leu Gln Pro His Leu Val

500 505 510 

Thr Phe Leu Gly Arg His Arg Ala Gly Ala Glu Glu Gly Val Thr Phe 
515 520 525 

Arg Leu Glu Asp Gly Arg Gly Ala Pro Ala Gly Arg Gly Gly Ala Pro
530 535 540 

Gly Pro Ala Lys Ala Ser Ile Leu Pro Asp Gln Ala Val Pro Ile Ala
545 550 555 560 

Leu Ile Ile Thr Pro Val Arg Val Glu Pro Gly Ile Tyr Arg Asp Ile
565 570 575

Arg Arg Asn Ser Arg Leu Ala Phe Asp Asp Thr Leu Ala Lys Leu Trp 

580 585 590 

Ala Ser Arg Ser Pro Gly Arg Gly Pro Ala Ala Ala Asp Thr Thr Ser 
595 600 605 

Ser Ser Pro Thr Ala Gly Arg Ser Ser Arg
610 615 



(2) INFORMATION FOR SEQ ID NO: 228: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 516 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:

Met Asp Glu Ser Gly Arg Gln Arg Pro Ala Ser His Val Ala Ala Asp

1 5 10 15 

Ile Ser Pro Gln Gly Ala His Arg Arg Ser Phe Lys Ala Trp Leu Ala
20 25 30

Ser Tyr Ile His Ser Leu Ser Arg Arg Ala Ser Gly Arg Pro Ser Gly

35 40 45 

Pro Ser Pro Arg Asp Gly Ala Val Ser Gly Ala Arg Pro Gly Ser Arg 
50 55 60

Arg Arg Ser Ser Phe Arg Glu Arg Leu Arg Ala Gly Leu Ser Arg Trp
65 70 75 80 

Arg Val Ser Arg Ser Ser Arg Arg Arg Ser Ser Pro Glu Ala Pro Gly 

85 90 95 

Pro Ala Ala Lys Leu Arg Arg Pro Pro Leu Arg Arg Ser Glu Thr Ala 
100 105 110

Met Thr Ser Pro Pro Ser Pro Pro Ser His Ile Leu Ser Leu Ala Arg

115 120 125 

Ile His Lys Leu Cys Ile Pro Val Phe Ala Val Asn Pro Ala Leu Arg
130 135 140 

Tyr Thr Thr Leu Glu Ile Pro Gly Ala Arg Ser Phe Gly Gly Ser Gly
145 150 155 160 

Gly Tyr Gly Glu Val Gln Leu Ile Arg Glu His Lys Leu Ala Val Lys

165 170 175 

Thr Ile Arg Glu Lys Glu Trp Phe Ala Val Glu Leu Val Ala Thr Leu
180 185 190

Leu Val Gly Glu Cys Ala Leu Arg Gly Gly Arg Thr His Asp Ile Arg

195 200 205 

Gly Phe Ile Thr Pro Leu Gly Phe Ser Leu Gln Gln Arg Gln Ile Val
210 215 220 

Phe Pro Ala Tyr Asp Met Asp Leu Gly Lys Tyr Ile Gly Gln Leu Ala
225 230 235 240 

Ser Leu Arg Ala Thr Thr Pro Ser Val Ala Thr Ala Leu His His Cys 

245 250 255 

Phe Thr Asp Leu Ala Arg Ala Val Val Phe Leu Asn Thr Arg Cys Gly 
260 265 270

Ile Ser His Leu Asp Ile Lys Cys Ala Asn Val Leu Val Met Leu Arg

275 280 285 

Ser Asp Ala Val Ser Leu Arg Arg Ala Val Leu Ala Asp Phe Ser Leu
290 295 300

Val Thr Leu Asn Ser Asn Ser Thr Ile Ser Arg Gly Gln Phe Cys Leu
305 310 315 320 

Gln Glu Pro Asp Leu Glu Ser Pro Arg Gly Phe Gly Met Pro Ala Ala
325 330 335

Leu Thr Thr Ala Asn Phe His Thr Leu Val Gly His Gly Tyr Asn Gln

340 345 350 

Pro Pro Glu Leu Leu Val Lys Tyr Leu Asn Asn Glu Arg Ala Glu Phe 
355 360 365 

Asn Asn Arg Pro Leu Lys His Asp Val Gly Leu Ala Val Asp Leu Tyr
370 375 380 

Ala Leu Gly Gln Thr Leu Leu Glu Leu Leu Val Ser Val Tyr Val Ala
385 390 395 400 

Pro Ser Leu Gly Val Pro Val Thr Arg Val Pro Gly Tyr Gln Tyr Phe
405 410 415

Asn Asn Gln Leu Ser Pro Asp Phe Ala Val Leu Ala Tyr Arg Cys Val

420 425 430 

Leu His Pro Ala Leu Phe Val Asn Ser Ala Glu Thr Asn Thr His Gly 
435 440 445 

Leu Ala Tyr Asp Val Pro Glu Gly Ile Arg Arg His Leu Arg Asn Pro
450 455 460 

Lys Ile Arg Arg Ala Phe Thr Glu Gln Cys Ile Asn Tyr Gln Arg Thr
465 470 475 480 

His Lys Ala Val Leu Ser Ser Val Ser Leu Pro Pro Glu Leu Arg Pro 
485 490 495

Leu Leu Val Leu Val Ser Arg Leu Cys His Ala Asn Pro Ala Ala Arg 

500 505 510 

His Ser Leu Ser 
515 






(2) INFORMATION FOR SEQ ID NO: 229: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 217 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:



Met Ser Arg Asp Ala Ser His Ala Ala Leu Arg Arg Arg Leu Ala Glu
1 5 10 15

Thr His Leu Arg Ala Glu Val Tyr Arg Asp Gln Thr Leu Gln Leu His

20 25 30 

Arg Glu Gly Val Ser Thr Gln Asp Pro Arg Phe Val Gly Ala Phe Met
35 40 45

Ala Ala Lys Ala Ala His Leu Glu Leu Glu Ala Arg Leu Lys Ser Arg 

50 55 60 

Ala Arg Leu Glu Met Met Arg Gln Arg Ala Thr Cys Val Lys Ile Arg
65 70 75 80

Val Glu Glu Gln Ala Ala Arg Arg Asp Phe Leu Thr Ala His Arg Arg

85 90 95 

Tyr Leu Asp Pro Ala Leu Ser Leu Asp Ala Ala Asp Asp Arg Leu Ala 

100 105 110 

Asp Gln Glu Glu Gln Leu Glu Glu Ala Ala Ala Asn Ala Ser Leu Trp
115 120 125

Gly Asp Gly Asp Leu Ala Asp Gly Trp Met Ser Pro Gly Asp Ser Asp 

130 135 140 

Leu Leu Val Met Trp Gln Leu Thr Ser Ala Pro Lys Val His Thr Asp
145 150 155 160 

Ala Pro Ser Arg Pro Gly Ser Arg Pro Thr Tyr Thr Pro Ser Ala Ala

165 170 175 

Gly Arg Pro Asp Ala Gln Ala Ala Pro Pro Pro Glu Thr Ala Pro Ser

180 185 190 

Pro Glu Pro Ala Pro Gly Pro Ala Ala Asp Pro Ala Ser Gly Ser Gly 
195 200 205

Phe Ala Arg Asp Cys Pro Asp Gly Glu 
210 215 






(2) INFORMATION FOR SEQ ID NO: 230: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 430 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230: 



Met Phe Gly Gln Gln Leu Ala Ser Asp Val Gln Gln Tyr Leu Glu Arg

1 5 10 15 

Leu Glu Lys Gln Arg Gln Gln Lys Val Gly Val Asp Glu Ala Ser Ala
20 25 30

Gly Leu Thr Leu Gly Gly Asp Ala Leu Arg Val Pro Phe Leu Asp Phe 

35 40 45

Ala Thr Ala Thr Pro Lys Arg His Gln Thr Val Val Pro Gly Val Gly
50 55 60

Thr Leu His Asp Cys Cys Glu His Ser Pro Leu Phe Ser Ala Val Ala 
65 70 75 80 

Arg Arg Leu Leu Phe Asn Ser Leu Val Pro Ala Gln Leu Arg Gly Arg
85 90 95

Asp Phe Gly Gly Asp His Thr Ala Lys Leu Glu Phe Leu Ala Pro Glu
100 105 110 

Leu Val Arg Ala Val Ala Arg Leu Arg Phe Arg Glu Cys Ala Pro Glu 

115 120 125 

Asp Ala Val Pro Gln Arg Asn Ala Tyr Tyr Ser Val Leu Asn Thr Phe
130 135 140

Gln Ala Leu His Arg Ser Glu Ala Phe Arg Gln Leu Val His Phe Val
145 150 155 160 

Arg Asp Phe Ala Gln Leu Leu Lys Thr Ser Phe Arg Ala Ser Ser Leu
165 170 175 

Ala Glu Thr Thr Gly Pro Pro Lys Lys Arg Ala Lys Val Asp Val Ala
180 185 190 

Thr His Gly Gln Thr Tyr Gly Thr Leu Glu Leu Phe Gln Lys Met Ile

195 200 205 

Leu Met His Ala Thr Tyr Phe Leu Ala Ala Val Leu Leu Gly Asp His 
210 215 220

Ala Glu Gln Val Asn Thr Phe Leu Arg Leu Val Phe Glu Ile Pro Leu
225 230 235 240 

Phe Ser Asp Thr Ala Val Arg His Phe Arg Gln Arg Ala Thr Val Phe
245 250 255 

Leu Val Pro Arg Arg His Gly Lys Thr Trp Phe Leu Val Pro Leu Ile
260 265 270 

Ala Leu Ser Leu Ala Ser Phe Arg Gly Ile Lys Ile Gly Tyr Thr Ala

275 280 285 

His Ile Arg Lys Ala Thr Glu Pro Val Phe Asp Glu Ile Asp Ala Cys
290 295 300

Leu Arg Gly Trp Phe Gly Ser Ser Arg Val Asp His Val Lys Gly Glu 
305 310 315 320 

Thr Ile Ser Phe Ser Phe Pro Asp Gly Ser Arg Ser Thr Ile Val Phe
325 330 335 

Ala Ser Ser His Asn Thr Asn Val Ser Thr Pro Ser Ser Arg Gly Ala
340 345 350

Cys Phe Pro Gly Ala Ala Leu Pro Glu Ile Asp Arg Gln Thr Asn Thr
355 360 365

Ala Arg Arg Glu Cys Gly Thr Trp Gln Pro Pro Pro Pro Trp Arg Gly

370 375 380 

Glu Ala Leu Leu Phe Ile Cys Asn Arg Thr Met Arg Leu Trp Pro Arg
385 390 395 400 

Pro Ala Arg Pro Arg Gly Ser Ser Leu Gln Thr Gly Gly Trp Tyr Thr

405 410 415 

Met Thr Glu Arg Arg Gly Ala Thr Arg Arg Trp Ser Gly Gly 
420 425 430 

(2) INFORMATION FOR SEQ ID NO: 231:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 315 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:

Val Trp Arg Val Val Arg Gly Asp Glu Arg Leu Lys Ile Phe Arg Cys

1 5 10 15

Leu Thr Val Leu Thr Glu Pro Leu Cys Gln Val Pro Asp Pro Asp Pro
20 25 30

Glu Arg Ala Leu Phe Cys Glu Ile Phe Leu Tyr Leu Trp Lys Ala Leu

35 40 45 

Arg Leu Pro Ser Asn Thr Phe Phe Ala Ile Phe Phe Phe Asn Arg Glu
50 55 60 

Arg Arg Tyr Cys Ala Thr Val His Leu Arg Ser Val Thr His Pro Arg
65 70 75 80 

Thr Pro Leu Leu Cys Thr Leu Ala Phe Gly His Leu Glu Ala Asp Pro 

85 90 95 

Glu Glu Thr Pro Asp Pro Ala Ala Glu Gln Leu Ala Asp Glu Pro Val
100 105 110

Ala His Glu Leu Asp Gly Ala Tyr Leu Val Pro Thr Glu Pro Pro Pro 

115 120 125 

Asn Pro Gly Ala Cys Cys Ala Leu Gly Pro Gly Ala Trp Trp His Leu 
130 135 140 

Pro Gly Gly Arg Ile Tyr Cys Trp Ala Met Asp Asp Asp Leu Gly Ser
145 150 155 160 

Leu Cys Pro Pro Gly Ser Arg Ala Arg His Leu Gly Trp Leu Leu Ser
165 170 175

Arg Ile Thr Asp Pro Pro Gly Gly Gly Gly Ala Cys Ala Pro Thr Ala

180 185 190 

His Ile Asp Ser Ala Asn Ala Leu Trp Arg Ala Pro Ala Val Ala Glu
195 200 205 

Ala Cys Pro Cys Val Ala Pro Cys Met Trp Ser Asn Met Ala Gln Arg
210 215 220 

Thr Leu Ala Val Arg Gly Asp Ala Ser Leu Cys Gln Leu Leu Phe Gly
225 230 235 240 

His Pro Val Asp Ala Val Ile Leu Arg Gln Ala Thr Arg Arg Pro Arg
245 250 255

Ile Thr Ala His Leu His Glu Val Val Val Gly Arg Asp Gly Ala Glu

260 265 270 

Ser Val Ile Arg Pro Thr Ser Ala Gly Trp Arg Leu Cys Val Leu Ser
275 280 285 

Ser Tyr Thr Ser Arg Leu Phe Ala Thr Ser Cys Pro Ala Val Ala Arg
290 295 300 

Ala Val Ala Arg Ala Ser Ser Ser Asp Tyr Lys 
305 310 315 

(2) INFORMATION FOR SEQ ID NO: 232:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 698 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:

Met Asn Ala His Phe Ala Asn Glu Val Gln Tyr Asp Leu Thr Arg Asp

1 5 10 15

Pro Ser Ser Pro Ala Ser Leu Ile His Val Ile Ile Ser Ser Glu Cys
20 25 30

Leu Ala Ala Ala Gly Val Pro Leu Ser Ala Leu Val Arg Gly Arg Pro 

35 40 45 

Asp Gly Gly Ala Ala Ala Asn Phe Arg Val Glu Thr Gln Thr Arg Ala
50 55 60 

His Ala Thr Gly Asp Cys Thr Pro Trp Arg Ser Ala Phe Ala Ala Tyr
65 70 75 80 

Val Pro Ala Asp Ala Val Gly Ala Ile Leu Ala Pro Val Ile Pro Ala
85 90 95

His Pro Asp Leu Leu Pro Arg Val Pro Ser Ala Gly Gly Leu Phe Val

100 105 110 

Ser Leu Pro Val Ala Cys Asp Ala Gln Gly Val Tyr Asp Pro Tyr Thr
115 120 125 

Val Ala Ala Leu Arg Leu Ala Trp Gly Pro Trp Ala Thr Cys Ala Arg
130 135 140 

Val Leu Leu Phe Ser Tyr Asp Glu Leu Val Pro Pro Asn Thr Arg Tyr 
145 150 155 160 

Ala Ala Asp Gly Ala Arg Leu Met Arg Leu Cys Arg His Phe Cys Arg 
165 170 175

Tyr Val Ala Arg Leu Gly Ala Ala Ala Pro Ala Ala Ala Thr Glu Ala 

180 185 190 

Ala Ala His Leu Ser Leu Gly Met Gly Glu Ser Gly Thr Pro Thr Pro 
195 200 205 

Gln Ala Ser Ser Val Ser Gly Gly Ala Gly Pro Ala Val Val Gly Thr
210 215 220 

Pro Asp Pro Pro Ile Ser Pro Glu Glu Gln Leu Thr Ala Pro Gly Gly
225 230 235 240 

Asp Thr Ala Thr Ala Glu Asp Val Ser Ile Thr Gln Glu Asn Glu Glu
245 250 255

Ile Leu Ala Leu Val Gln Arg Ala Val Gln Asp Val Thr Arg Arg His

260 265 270 

Pro Val Arg Ala Arg Pro Lys His Ala Ala Ser Gly Val Ala Ser Gly 
275 280 285 

Leu Arg Gln Gly Ala Leu Val His Gln Ala Val Ser Gly Gly Ala Leu
290 295 300 

Gly Ala Ser Asp Ala Glu Ala Val Leu Ala Gly Leu Glu Pro Pro Gly 
305 310 315 320 

Gly Gly Arg Phe Ala Thr Pro Gly Gly Pro Arg Ala Ala Gly Glu Asp 
325 330 335

Val Leu Asn Asp Val Leu Thr Leu Val Pro Gly Thr Ala Lys Pro Arg 

340 345 350 

Ser Leu Val Glu Trp Leu Asp Arg Gly Trp Glu Ala Gly Gly Asp Arg 
355 360 365

Pro Asp Trp Leu Trp Ser Arg Arg Ser Ile Ser Val Val Leu Arg His
370 375 380 

His Tyr Gly Thr Lys Gln Arg Phe Val Val Val Ser Tyr Glu Asn Ser
385 390 395 400 

Val Ala Trp Gly Gly Arg Arg Ala Arg Pro Pro Arg Leu Ser Ser Glu 
405 410 415

Leu Ala Thr Ala Leu Thr Glu Ala Cys Ala Ala Glu Arg Val Val Arg 

420 425 430 

Pro His Gln Leu Ser Pro Ala Ala Gln Thr Ala Leu Leu Arg Arg Phe
435 440 445

Pro Ala Leu Glu Gly Pro Leu Arg His Pro Arg Pro Val Leu Gln Pro
450 455 460

Phe Asp Ile Ala Ala Glu Val Ala Phe Val Ala Arg Ile Gln Ile Ala
465 470 475 480

Cys Leu Arg Ala Leu Gly His Ser Ile Arg Ala Ala Leu Gln Gly Gly

485 490 495 

Pro Arg Ile Phe Gln Arg Leu Arg Tyr Asp Phe Gly Pro His Gln Ser
500 505 510 

Glu Trp Leu Gly Glu Val Thr Arg Arg Phe Pro Val Leu Leu Glu Asn
515 520 525 

Leu Met Arg Ala Leu Glu Gly Thr Ala Pro Asp Ala Phe Phe His Thr 

530 535 540 

Ala Tyr Ala Val Leu Ala His Leu Gly Gly Gln Gly Gly Arg Gly Arg
545 550 555 560

Arg Arg Arg Leu Val Pro Leu Ser Asp Asp Ile Pro Ala Arg Phe Ala

565 570 575 

Asp Ser Asp Ala His Tyr Ala Phe Asp Tyr Tyr Ser Thr Ser Gly Asp 
580 585 590

Thr Leu Arg Leu Thr Asn Arg Pro Ile Ala Val Val Ile Asp Gly Asp
595 600 605 

Val Asn Gly Arg Glu Gln Ser Lys Cys Arg Phe Met Glu Gly Ser Pro

610 615 620 

Ser Thr Ala Pro His Arg Val Cys Glu Gln Tyr Leu Pro Gly Glu Ser
625 630 635 640

Tyr Ala Tyr Leu Cys Leu Gly Phe Asn Arg Arg Leu Cys Gly Leu Val 

645 650 655 

Val Phe Pro Gly Gly Phe Ala Phe Thr Ile Asn Thr Ala Ala Tyr Leu
660 665 670 

Ser Leu Ala Asp Pro Val Ala Arg Ala Val Gly Leu Arg Phe Cys Arg
675 680 685 

Gly Ala Ala Thr Gly Pro Gly Leu Val Arg 
690 695 

(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 423 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

Val Pro Glu Gly Ala Trp Val Gly Gly Ala Cys Ala Arg Pro Arg Gly 
1 5 10 15

Pro Arg Ala His Val Arg Leu Tyr Ala Val Cys Phe Val Cys Pro Gln

20 25 30 

Gly Ile Arg Gly Gln Asp Phe Asn Leu Leu Phe Val Asp Glu Ala Asn
35 40 45 

Phe Ile Arg Pro Asp Ala Val Gln Thr Ile Met Gly Phe Leu Asn Gln
50 55 60 

Ala Asn Cys Lys Ile Ile Phe Val Ser Ser Thr Asn Thr Gly Lys Ala
65 70 75 80 

Ser Thr Ser Phe Leu Tyr Asn Leu Arg Gly Ala Ala Asp Glu Leu Leu 
85 90 95

Asn Val Val Thr Tyr Ile Cys Asp Asp His Met Pro Arg Val Val Thr

100 105 110 

His Thr Asn Ala Thr Ala Cys Ser Cys Tyr Ile Leu Asn Lys Pro Val
115 120 125 

Phe Ile Thr Met Asp Gly Ala Val Arg Arg Thr Ala Asp Leu Phe Leu
130 135 140 

Pro Asp Ser Phe Met Gln Glu Ile Ile Gly Gly Gln Ala Arg Glu Thr
145 150 155 160 

Gly Asp Asp Arg Pro Val Leu Thr Lys Ser Ala Gly Glu Arg Phe Leu 
165 170 175

Leu Tyr Arg Pro Ser Thr Thr Thr Asn Ser Gly Leu Met Ala Pro Glu 

180 185 190

Leu Tyr Val Tyr Val Asp Pro Ala Phe Thr Ala Asn Thr Arg Ala Ser
195 200 205 

Gly Thr Gly Ile Ala Val Val Gly Arg Tyr Arg Asp Asp Phe Ile Ile
210 215 220 

Phe Ala Leu Glu His Phe Phe Leu Arg Ala Leu Thr Gly Ser Ala Pro 
225 230 235 240 

Ala Asp Ile Ala Arg Cys Val Val His Ser Leu Ala Gln Val Leu Ala
245 250 255

Leu His Pro Gly Ala Phe Arg Ser Val Arg Val Ala Val Glu Gly Asn 

260 265 270 

Ser Ser Gln Asp Ser Ala Val Ala Ile Ala Thr His Val His Thr Glu
275 280 285 

Met His Arg Ile Leu Ala Ser Ala Gly Ala Asn Gly Pro Gly Pro Glu
290 295 300 

Leu Leu Phe Tyr His Cys Glu Pro Pro Gly Gly Ala Val Leu Tyr Pro
305 310 315 320

Phe Phe Leu Leu Asn Lys Gln Lys Thr Pro Ala Phe Glu Tyr Phe Ile

325 330 335 

Lys Lys Phe Asn Ser Gly Gly Val Met Ala Ser Gln Glu Leu Val Ser
340 345 350 

Val Thr Val Arg Leu Gln Thr Asp Pro Val Glu Tyr Leu Ser Glu Gln
355 360 365 

Leu Asn Asn Leu Ile Glu Thr Val Ser Pro Asn Thr Asp Val Arg Met

370 375 380 

Tyr Ser Gly Lys Arg Asn Gly Ala Ala Asp Asp Leu Met Val Ala Val 
385 390 395 400

Ile Met Ala Ile Tyr Leu Ala Ala Pro Thr Gly Ile Pro Pro Ala Phe

405 410 415 

Phe Pro Ile Thr Arg Thr Ser
420 






(2) INFORMATION FOR SEQ ID NO: 234: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 312 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:



Met Ile Thr Asp Cys Phe Glu Ala Asp Ile Ala Ile Pro Ser Gly Ile
1 5 10 15 

Ser Arg Pro Asp Ala Ala Ala Leu Gln Arg Cys Glu Gly Arg Val Val
20 25 30 

Phe Leu Pro Thr Ile Arg Arg Gln Leu Ala Asp Val Ala His Glu Ser

35 40 45 

Phe Val Ser Gly Gly Val Ser Pro Asp Thr Leu Gly Leu Leu Leu Ala 
50 55 60

Tyr Arg Arg Arg Phe Pro Ala Val Ile Thr Arg Val Leu Pro Thr Arg
65 70 75 80 

Ile Val Ala Cys Pro Val Asp Leu Gly Leu Thr His Ala Gly Thr Val
85 90 95 

Asn Leu Arg Asn Thr Ser Pro Val Asp Leu Cys Asn Gly Asp Pro Val
100 105 110

Ser Leu Val Pro Pro Val Phe Glu Gly Gln Ala Thr Asp Val Arg Leu
115 120 125

Glu Ser Leu Asp Leu Thr Leu Arg Phe Pro Val Pro Leu Pro Thr Pro

130 135 140 

Leu Ala Arg Glu Ile Val Ala Arg Leu Val Arg Ile Arg Asp Leu Asn
145 150 155 160 

Pro Asp Pro Arg Thr Pro Gly Glu Leu Pro Asp Leu Asn Val Leu Tyr

165 170 175 

Tyr Asn Gly Ala Arg Leu Ser Leu Val Ala Asp Val Gln Gln Leu Ala

180 185 190 

Ser Val Asn Thr Glu Leu Arg Ser Leu Val Leu Asn Met Val Tyr Ser 
195 200 205

Ile Thr Glu Gly Thr Thr Leu Ile Leu Thr Leu Ile Pro Arg Leu Leu

210 215 220 

Ala Leu Ser Ala Gln Asp Gly Tyr Val Asn Ala Leu Leu Gln Met Gln
225 230 235 240 

Ser Val Thr Arg Glu Ala Ala Gln Leu Ile His Pro Glu Ala Pro Met

245 250 255 

Leu Met Gln Asp Gly Glu Arg Arg Leu Pro Leu Tyr Glu Ala Leu Val

260 265 270 

Ala Trp Leu Ala His Ala Gly Gln Leu Gly Asp Ile Leu Ala Pro Ala
275 280 285

Val Arg Val Cys Thr Phe Asp Gly Ala Ala Val Val Gln Ser Gly Asp

290 295 300 

Met Ala Pro Val Ile Arg Tyr Pro
305 310 






(2) INFORMATION FOR SEQ ID NO: 235: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 222 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:



Met Thr Met Arg Asp Asp Val Pro Leu Leu Asp Arg Glu Leu Val Tyr 
1 5 10 15

Glu Ala Ala Cys Gly Gly Glu Asp Gly Glu Leu Pro Leu Asp Glu Gln
20 25 30 

Phe Ser Leu Ser Ser Tyr Gly Thr Ser Asp Phe Phe Val Ser Ser Ala
35 40 45

Tyr Ser Arg Leu Pro Pro His Thr Gln Pro Val Phe Ser Lys Arg Val

50 55 60 

Val Met Phe Ala Trp Ser Phe Leu Val Leu Lys Pro Leu Glu Leu Val
65 70 75 80 

Ala Ala Gly Met Tyr Tyr Gly Trp Thr Gly Arg Ala Val Ala Pro Ala

85 90 95 

Cys Ile Ile Ala Ala Val Leu Ala Tyr Tyr Val Thr Trp Leu Ala Arg

100 105 110 

Ala Leu Leu Leu Tyr Val Asn Ile Lys Arg Asp Arg Leu Pro Leu Ser
115 120 125

Pro Pro Val Phe Trp Gly Leu Cys Val Ile Met Gly Gly Ala Ala Leu

130 135 140 

Cys Ala Leu Val Ala Ala Ala His Glu Thr Phe Ser Pro Asp Gly Leu 
145 150 155 160 

Phe His Trp Ile Thr Ala Ser Gln Leu Leu Pro Arg Thr Asp Pro Leu

165 170 175 

Arg Ala Arg Ser Leu Gly Ile Ala Cys Ala Ala Gly Ala Ala Met Trp

180 185 190 

Val Ala Ala Ala Asp Cys Phe Ala Ala Phe Thr Asn Phe Phe Leu Ala 
195 200 205

Arg Phe Trp Thr Arg Ala Ile Leu Lys Ala Pro Val Ala Phe
210 215 220 






(2) INFORMATION FOR SEQ ID NO: 236: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 824 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:

Met Gly Pro Gly Leu Trp Val Val Met Gly Val Leu Val Gly Val Ala 

1 5 10 15

Gly Gly His Asp Thr Tyr Trp Thr Glu Gln Ile Asp Pro Trp Phe Leu
20 25 30 

His Gly Leu Gly Leu Ala Arg Thr Tyr Trp Arg Asp Thr Asn Thr Gly
35 40 45 

Arg Leu Trp Leu Pro Asn Thr Pro Asp Ala Ser Asp Pro Gln Arg Gly
50 55 60

Arg Leu Ala Pro Pro Gly Glu Leu Asn Leu Thr Thr Ala Ser Val Pro
65 70 75 80 

Met Leu Arg Trp Tyr Ala Glu Arg Phe Cys Phe Val Leu Val Thr Thr 
85 90 95 

Ala Glu Phe Pro Arg Asp Pro Gly Gln Leu Leu Tyr Ile Pro Lys Thr
100 105 110 

Tyr Leu Leu Gly Arg Pro Arg Asn Ala Ser Leu Pro Glu Leu Pro Glu 

115 120 125 

Ala Gly Pro Thr Ser Arg Pro Pro Ala Glu Val Thr Gln Leu Lys Gly
130 135 140

Leu Ser His Asn Pro Gly Ala Ser Ala Leu Leu Arg Ser Arg Ala Trp 
145 150 155 160 

Val Thr Phe Ala Ala Ala Pro Asp Arg Glu Gly Leu Thr Phe Pro Arg 
165 170 175 

Gly Asp Asp Gly Ala Thr Glu Arg His Pro Asp Gly Arg Arg Asn Ala
180 185 190 

Pro Pro Pro Gly Pro Pro Ala Gly Thr Pro Arg His Pro Thr Thr Asn 

195 200 205 

Leu Ser Ile Ala His Leu His Asn Ala Ser Val Thr Trp Leu Ala Arg
210 215 220

Leu Leu Arg Thr Pro Gly Arg Tyr Val Tyr Leu Ser Pro Ser Ala Ser 
225 230 235 240 

Thr Trp Pro Val Gly Val Trp Thr Thr Gly Gly Leu Ala Phe Gly Cys 
245 250 255 

Asp Ala Ala Leu Val Arg Ala Arg Tyr Gly Lys Gly Phe Met Gly Leu
260 265 270 

Val Ile Ser Met Arg Asp Ser Pro Pro Ala Glu Ile Ile Val Val Pro

275 280 285 

Ala Asp Lys Thr Leu Ala Arg Val Gly Asn Pro Thr Asp Glu Asn Ala 
290 295 300

Pro Ala Val Leu Pro Gly Pro Pro Ala Gly Pro Arg Tyr Arg Val Phe 
305 310 315 320 

Val Leu Gly Ala Pro Thr Pro Ala Asp Asn Gly Ser Ala Leu Asp Ala 
325 330 335 

Leu Arg Arg Val Ala Gly Tyr Pro Glu Glu Ser Thr Asn Tyr Ala Gln
340 345 350 

Tyr Met Ser Arg Ala Tyr Ala Glu Phe Leu Gly Glu Asp Pro Gly Ser 

355 360 365 

Gly Thr Asp Ala Arg Pro Ser Leu Phe Trp Arg Leu Ala Gly Leu Leu 
370 375 380

Ala Ser Ser Gly Phe Ala Phe Val Asn Ala Ala His Ala His Asp Ala 
385 390 395 400 

Ile Arg Leu Ser Asp Leu Leu Gly Phe Leu Ala His Ser Arg Val Leu
405 410 415

Ala Gly Leu Ala Arg Ala Ala Gly Cys Ala Ala Asp Ser Val Phe Leu 

420 425 430 

Asn Val Ser Val Leu Asp Pro Ala Ala Arg Leu Arg Leu Glu Ala Arg 
435 440 445

Leu Gly His Leu Val Ala Ala Ile Arg Glu Gln Ser Leu Ala Ala His

450 455 460 

Ala Leu Gly Tyr Gln Leu Ala Phe Val Leu Asp Ser Pro Ala Ala Tyr
465 470 475 480 

Gly Ala Val Ala Pro Ser Ala Ala Arg Leu Ile Asp Ala Leu Tyr Ala

485 490 495 

Glu Phe Leu Gly Gly Arg Ala Leu Thr Ala Pro Met Val Arg Arg Ala 

500 505 510 

Leu Phe Tyr Ala Thr Ala Val Leu Arg Ala Pro Phe Leu Ala Gly Ala 
515 520 525

Pro Ser Ala Glu Gln Arg Glu Arg Ala Arg Arg Gly Leu Leu Ile Thr

530 535 540 

Thr Ala Leu Cys Thr Ser Asp Val Ala Ala Ala Thr His Ala Asp Leu 
545 550 555 560 

Arg Ala Ala Arg Thr Asp His Gln Lys Asn Leu Phe Trp Leu Pro Asp

565 570 575 

His Phe Ser Pro Cys Ala Ala Ser Leu Arg Phe Asp Leu Ala Glu Gly 

580 585 590 

Gly Phe Ile Leu Asp Ala Met Ala Thr Arg Ser Asp Ile Pro Ala Asp
595 600 605

Val Met Ala Gln Gln Thr Arg Gly Val Ala Ser Val Leu Thr Arg Trp

610 615 620 

Ala His Tyr Asn Ala Leu Ile Arg Ala Phe Val Pro Glu Ala Thr His
625 630 635 640 

Gln Cys Ser Gly Pro Ser His Asn Ala Glu Pro Arg Ile Leu Val Pro

645 650 655 

Ile Thr His Asn Ala Ser Tyr Val Val Thr His Thr Pro Leu Pro Arg

660 665 670 

Gly Ile Gly Tyr Lys Leu Thr Gly Val Asp Val Arg Arg Pro Leu Phe
675 680 685

Ile Thr Tyr Leu Thr Ala Thr Cys Glu Gly His Ala Arg Glu Ile Glu

690 695 700 

Pro Lys Arg Leu Val Arg Thr Glu Asn Arg Arg Asp Leu Gly Leu Val 
705 710 715 720 

Gly Ala Val Phe Leu Arg Tyr Thr Pro Ala Gly Glu Val Met Ser Val

725 730 735 

Leu Leu Val Asp Thr Asp Ala Thr Gln Gln Gln Leu Ala Gln Gly Pro
740 745 750

Val Ala Gly Thr Pro Asn Val Phe Ser Ser Asp Val Pro Ser Val Leu

755 760 765 

Leu Phe Pro Asn Gly Thr Val Ile His Leu Leu Ala Phe Asp Thr Leu
770 775 780

Pro Ile Ala Thr Ile Ala Pro Gly Phe Leu Ala Ala Ser Ala Leu Gly
785 790 795 800

Val Val Met Ile Thr Ala Ala Gly Ile Leu Arg Val Val Arg Thr Cys

805 810 815 

Val Pro Phe Leu Trp Arg Arg Glu 
820

(2) INFORMATION FOR SEQ ID NO: 237: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 370 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:

Met Ala Ser His Ala Gly Gln Gln His Ala Pro Ala Phe Gly Gln Ala
1 5 10 15

Ala Arg Ala Ser Gly Pro Thr Asp Gly Arg Ala Ala Ser Arg Pro Ser 

20 25 30 

His Arg Gln Gly Ala Ser Asp Pro Glu Leu Pro Thr Leu Leu Arg Val
35 40 45 

Tyr Ile Asp Gly Pro His Gly Val Gly Lys Thr Thr Thr Ser Ala Gln
50 55 60 

Leu Met Glu Ala Leu Gly Pro Arg Asp Asn Ile Val Tyr Val Pro Glu
65 70 75 80 

Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr Leu Thr Asn
85 90 95

Ile Tyr Asn Thr Gln His Arg Leu Asp Arg Gly Glu Ile Ser Ala Gly
100 105 110

Glu Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met Ser Thr Pro
115 120 125 

Tyr Ala Ala Thr Asp Ala Val Leu Ala Pro His Ile Gly Gly Glu Ala
130 135 140 

Val Gly Pro Gln Ala Pro Pro Pro Ala Leu Thr Leu Val Phe Asp Arg
145 150 155 160

His Pro Ile Ala Ser Leu Leu Cys Tyr Pro Ala Ala Arg Tyr Leu Met

165 170 175 

Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Met Pro Pro Thr
180 185 190 

Ala Pro Gly Thr Asn Leu Val Leu Gly Val Leu Pro Glu Ala Glu His
195 200 205 

Ala Asp Arg Leu Ala Arg Arg Gln Arg Pro Gly Glu Arg Leu Asp Leu

210 215 220 

Ala Met Leu Ser Ala Ile Arg Arg Val Tyr Asp Leu Leu Ala Asn Thr
225 230 235 240

Val Arg Tyr Leu Gln Arg Gly Gly Arg Trp Arg Glu Asp Trp Gly Arg

245 250 255 

Leu Thr Gly Val Ala Ala Ala Thr Pro Arg Pro Asp Pro Glu Asp Gly 
260 265 270

Ala Gly Ser Leu Pro Arg Ile Glu Asp Thr Leu Phe Ala Leu Phe Arg
275 280 285 

Val Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu Tyr His Ile Phe Ala

290 295 300 

Trp Val Leu Asp Val Leu Ala Asp Arg Leu Leu Pro Met His Leu Phe 
305 310 315 320

Val Leu Asp Tyr Asp Gln Ser Pro Val Gly Cys Arg Asp Ala Leu Leu
325 330 335

Arg Leu Thr Ala Gly Met Ile Pro Thr Arg Val Thr Thr Ala Gly Ser
340 345 350 

Ile Ala Glu Ile Arg Asp Leu Ala Arg Thr Phe Ala Arg Glu Val Gly
355 360 365

Gly Val 
370 

(2) INFORMATION FOR SEQ ID NO: 238:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:



Met Ala Arg Thr Gly Arg Arg Ala Ala Val Gly Arg Pro Ala Arg Thr
1 5 10 15

Ser Ser Leu Thr Glu Arg Arg Arg Val Leu Leu Ala Gly Val Arg Ser

20 25 30 

His Thr Arg Phe Tyr Lys Ala Phe Ala Arg Glu Val Arg Glu Phe Asn 
35 40 45 

Ala Thr Arg Ile Cys Gly Thr Leu Leu Thr Leu Met Ser Gly Ser Leu
50 55 60 

Gln Gly Arg Ser Leu Phe Glu Ala Thr Arg Val Thr Leu Ile Cys Glu
65 70 75 80 

Val Asp Leu Gly Pro Arg Arg Pro Asp Cys Ile Cys Val Phe Glu Phe
85 90 95

Ala Asn Asp Lys Thr Leu Gly Gly Val Cys Val Ile Leu Lys Thr Cys

100 105 110 

Lys Ser Ile Ser Ser Gly Asp Thr Ala Ser Lys Arg Glu Gln Arg Thr
115 120 125 

Thr Gly Met Lys Gln Leu Arg His Ser Leu Lys Leu Leu Gln Ser Leu
130 135 140 

Ala Pro Pro Gly Asp Lys Val Val Tyr Leu Cys Pro Ile Leu Val Phe
145 150 155 160 

Val Ala Gln Arg Thr Leu Arg Val Ser Arg Val Thr Arg Leu Val Pro
165 170 175

Gln Lys Ile Ser Gly Asn Ile Thr Ala Ala Val Arg Met Leu Gln Ser

180 185 190 

Leu Ser Thr Tyr Ala Val Pro Pro Glu Pro Gln Thr Arg Arg Ser Arg
195 200 205 

Arg Arg Val Ala Ala Thr Ala Arg Pro Gln Arg Pro Pro Ser Pro Thr
210 215 220 

Arg Asp Pro Glu Gly Thr Ala Gly His Pro Ala Pro Pro Glu Ser Asp 
225 230 235 240 

Pro Pro Ser Pro Gly Val Val Gly Val Ala Ala Glu Gly Gly Gly Val 
245 250 255

Leu Gln Lys Ile Ala Ala Leu Phe Cys Val Pro Val Ala Ala Lys Ser

260 265 270 

Arg Pro Arg Thr Lys Thr Glu 
275 






(2) INFORMATION FOR SEQ ID NO: 239: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 571 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:

Met Asp Pro Tyr Tyr Pro Phe Asp Ala Leu Asp Val Trp Glu His Arg
1 5 10 15

Arg Phe Ile Val Ala Asp Ser Arg Ser Phe Ile Thr Pro Glu Phe Pro

20 25 30 

Arg Asp Phe Trp Met Leu Pro Val Phe Asn Ile Pro Arg Glu Thr Ala
35 40 45

Ala Glu Arg Ala Ala Val Leu Gln Ala Gln Arg Thr Ala Ala Ala Ala

50 55 60 

Ala Leu Glu Asn Ala Ala Leu Gln Ala Ala Glu Leu Pro Val Asp Ile
65 70 75 80

Glu Arg Arg Ile Arg Pro Ile Glu Gln Gln Val His His Ile Ala Asp

85 90 95 

Ala Leu Glu Ala Leu Glu Thr Ala Ala Ala Ala Ala Glu Glu Ala Asp 

100 105 110 

Ala Ala Arg Asp Ala Glu Arg Glu Gly Ala Ala Asp Gly Ala Ala Pro 
115 120 125

Ser Pro Thr Ala Gly Pro Ala Ala Ala Glu Met Glu Val Gln Ile Val

130 135 140 

Arg Asn Asp Pro Pro Leu Arg Tyr Asp Thr Asn Leu Pro Val Asp Leu 
145 150 155 160 

Leu His Met Val Tyr Ala Gly Arg Gly Ala Ala Gly Ser Ser Gly Val

165 170 175 

Val Phe Gly Thr Trp Tyr Arg Thr Ile Gln Glu Arg Thr Ile Ala Asp

180 185 190 

Phe Pro Leu Thr Thr Arg Ser Ala Asp Phe Arg Asp Gly Arg Met Ser 
195 200 205

Lys Thr Phe Met Thr Ala Leu Val Leu Ser Leu Gln Ser Cys Gly Arg

210 215 220 

Leu Tyr Val Gly Gln Arg His Tyr Ser Ala Phe Glu Cys Ala Val Leu
225 230 235 240

Cys Leu Tyr Leu Leu Tyr Arg Thr Thr His Glu Ser Ser Pro Asp Arg

245 250 255 

Asp Arg Ala Pro Val Ala Phe Gly Asp Leu Leu Ala Arg Leu Pro Arg 

260 265 270 

Tyr Leu Ala Arg Leu Ala Ala Val Ile Gly Asp Glu Ser Gly Arg Pro
275 280 285

Gln Tyr Arg Tyr Arg Asp Asp Lys Leu Pro Lys Ala Gln Phe Ala Ala

290 295 300 

Ala Gly Gly Arg Tyr Glu His Gly Ala Thr His Val Val Ile Ala Thr
305 310 315 320

Leu Val Arg His Gly Val Leu Pro Ala Ala Pro Gly Asp Val Pro Arg 

325 330 335 

Asp Thr Ser Thr Arg Val Asn Pro Asp Asp Val Ala His Arg Asp Asp 
340 345 350

Val Asn Arg Ala Ala Ala Ala Phe Leu Arg His Asn Leu Phe Leu Trp 

355 360 365 

Glu Asp Gln Thr Leu Leu Arg Ala Thr Ala Asn Thr Ile Thr Ala Val
370 375 380 

Leu Arg Arg Leu Leu Ala Asn Gly Asn Val Tyr Ala Asp Arg Leu Asp
385 390 395 400 

Asn Arg Leu Gln Leu Gly Met Leu Ile Pro Gly Ala Val Pro Ala Glu

405 410 415 

Ala Ile Arg Ala Ser Gly Leu Asp Ser Gly Ala Ile Lys Ser Gly Asp
420 425 430

Asn Asn Leu Glu Ala Leu Cys Val Asn Tyr Val Leu Pro Leu Tyr Gln

435 440 445 

Ala Asp Pro Thr Val Glu Leu Thr Gln Leu Phe Pro Gly Leu Ala Ala
450 455 460 

Leu Cys Leu Asp Ala Gln Ala Gly Arg Pro Leu Ala Ser Thr Arg Arg
465 470 475 480 

Val Val Asp Met Ser Ser Gly Ala Arg Gln Ala Ala Leu Val Arg Leu

485 490 495 

Thr Ala Leu Glu Leu Ile Asn Arg Thr Arg Thr Asn Thr Thr Pro Val
500 505 510

Gly Glu Ile Ile Asn Ala His Asp Ala Leu Gly Ile Gln Tyr Glu Gln

515 520 525 

Gly Leu Gly Leu Leu Ala Gln Gln Ala Arg Ile Gln Ala Lys Arg Phe
530 535 540 

Ala Thr Phe Asn Val Gly Ser Asp Tyr Asp Leu Leu Tyr Phe Leu Cys
545 550 555 560 

Leu Gly Phe Ile Pro Gln Tyr Leu Ser Val Ala
565 570 

(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

Met Ala Ser Ala Glu Met Arg Glu Arg Leu Glu Ala Pro Leu Pro Asp 
1 5 10 15

Arg Ala Val Pro Ile Tyr Val Ala Gly Phe Leu Ala Leu Tyr Asp Ser

20 25 30 

Gly Asp Pro Gly Glu Leu Ala Leu Asp Pro Asp Thr Val Arg Ala Ala 
35 40 45 

Leu Pro Pro Glu Asn Pro Leu Pro Ile Asn Val Asp His Arg Ala Arg
50 55 60 

Cys Glu Val Gly Arg Val Leu Ala Val Val Asn Asp Pro Arg Gly Pro 
65 70 75 80 

Phe Phe Val Gly Leu Ile Ala Cys Val Gln Leu Glu Arg Val Leu Glu
85 90 95

Thr Ala Ala Ser Ala Ala Ile Phe Glu Arg Arg Gly Pro Ala Leu Ser

100 105 110 

Arg Glu Glu Arg Leu Leu Tyr Leu lie Thr Asn Tyr Leu Pro Ser Val 
115 120 125 

20 Ser Leu Ser Thr Lys Arg Arg Gly Asp Glu Val Pro Pro Asp Arg Thr 
130 135 140 

Leu Phe Ala His Val Cys Ala lie Gly Arg Arg Leu Gly Thr He Val 
145 150 155 160 

Thr Tyr Asp Thr Ser Leu Asp Ala Ala He Ala Pro Phe Arg His Leu 
25 165 170 175 

Asp Pro Ala Thr Arg Glu Gly Val Arg Arg Glu Ala Ala Glu Ala Glu 

180 185 190 

Leu Ala Gly Arg Thr Trp Ala Pro Gly Val Glu Ala Leu Thr His Thr 
195 200 205 

30 Leu Leu Ser Thr Ala Val Asn Asn Met Met Leu Arg Asp Arg Trp Ser 
210 215 220 

Leu Val Ala Glu Arg Arg Arg Gin Ala Gly He Ala Gly His Thr Tyr 
225 230 235 240 

Leu Gin Ala Ser Glu Lys Phe Lys He Trp Gly Ala Glu Ser Ala Pro 
35 245 250 255 

Ala Pro Glu Arg Gly Tyr Lys Thr Gly Ala Pro Gly Ala Met Asp Thr 

260 265 270 

Ser Pro Ala Ala Ser Val Pro Ala Pro Gin Val Ala Val Arg Ala Arg 
275 280 285 

40 Gin Val Ala Ser Ser Ser Ser Ser Ser Ser Ser Phe Pro Ala Pro Ala 
290 295 300 

Asp Met Asn Pro Val Ser Ala Ser Gly Ala Pro Ala Pro Pro Pro Pro
305 310 315 320

Gly Asp Gly Ser Tyr Leu Trp Ile Pro Ala Phe His Tyr Asn Gln Leu
325 330 335

Val Thr Gly Gln Ser Ala Pro His His Pro Pro Leu Thr Ala Cys Gly
340 345 350

Leu Pro Ala Ala Gly Thr Val Ala Tyr Gly His Pro Gly Ala Gly Pro
355 360 365

Ser Pro His Tyr Pro Pro Pro Pro Ala His Pro Tyr Pro Gly Met Leu
370 375 380

Phe Ala Gly Pro Ser Pro Leu Glu Ala Gln Ile Ala Ala Leu Val Gly
385 390 395 400

Ala Ile Ala Ala Asp Arg Gln Ala Gly Gly Leu Pro Ala Ala Ala Gly
405 410 415

Asp His Gly Ile Arg Gly Ser Ala Lys Arg Arg Arg His Glu Val Glu
420 425 430

Gln Pro Glu Tyr Asp Cys Gly Arg Asp Glu Pro Asp Arg Asp Phe Pro
435 440 445

Tyr Tyr Pro Gly Glu Ala Arg Pro Glu Pro Arg Pro Val Asp Ser Arg
450 455 460

Arg Ala Ala Arg Gln Ala Ser Gly Phe Thr Ile Thr Ala Leu Val Gly
465 470 475 480

Ala Val Thr Ser Leu Gln Gln Glu Leu Ala His Met Arg Ala Arg Thr
485 490 495

His Ala Pro Tyr Gly Pro Tyr Pro Pro Val Gly Pro Tyr His His Pro
500 505 510

His Ala Asp Thr Glu Thr Pro Ala Gln Pro Pro Arg Tyr Pro Ala Glu
515 520 525

Ala Val Tyr Leu Pro Pro Pro His Ile Ala Pro Pro Gly Pro Pro Leu
530 535 540

Ser Gly Ala Val Pro Pro Pro Ser Tyr Pro Pro Val Ala Val Thr Pro
545 550 555 560

Gly Pro Ala Pro Pro Leu His Gln Pro Ser Pro Ala His Ala His Pro
565 570 575

Pro Pro Pro Pro Pro Gly Pro Thr Pro Pro Pro Ala Ala Ser Leu Pro
580 585 590

Gln Pro Glu Ala Pro Gly Ala Glu Ala Gly Ala Leu Val Asn Ala Ser
595 600 605

Ser Ala Ala His Val Lys Arg Gly His Gly Pro Gly Arg Arg Ser Val
610 615 620

Cys Val Thr Asp Asp Gly Val Pro Leu Thr Arg Leu Gln Asp Pro Asp
625 630 635 640

Leu Gly Gly Val Cys Val Phe Ile Tyr Phe Lys
645 650

(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 896 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:



Met Arg Gly Gly Gly Leu Ile Cys Ala Leu Val Val Gly Ala Leu Val
1 5 10 15

Ala Ala Val Ala Ser Ala Ala Pro Ala Ala Pro Ala Ala Pro Arg Ala
20 25 30

Ser Gly Gly Val Ala Ala Thr Val Ala Ala Asn Gly Gly Pro Ala Ser
35 40 45

Arg Pro Pro Pro Val Pro Ser Pro Ala Thr Thr Lys Ala Arg Lys Arg
50 55 60

Lys Thr Lys Lys Pro Pro Lys Arg Pro Glu Ala Thr Pro Pro Pro Asp
65 70 75 80

Ala Asn Ala Thr Val Ala Ala Gly His Ala Thr Leu Arg Ala His Leu
85 90 95

Arg Glu Ile Lys Val Glu Asn Ala Asp Ala Gln Phe Tyr Val Cys Pro
100 105 110

Pro Pro Thr Gly Ala Thr Val Val Gln Phe Glu Gln Pro Arg Arg Cys
115 120 125

Pro Trp Glu Gly Gln Asn Tyr Thr Glu Gly Ile Ala Val Val Phe Lys
130 135 140

Glu Asn Ile Ala Pro Tyr Lys Phe Lys Ala Thr Met Tyr Tyr Lys Asp
145 150 155 160

Val Thr Val Ser Gln Val Trp Phe Gly His Arg Tyr Ser Gln Phe Met
165 170 175

Gly Ile Phe Glu Asp Arg Ala Pro Val Pro Phe Glu Glu Val Ile Asp
180 185 190

Lys Ile Asn Ala Lys Gly Val Cys Arg Ser Thr Ala Lys Tyr Val Arg
195 200 205

Asn Asn Met Thr Ala Phe His Arg Asp Asp His Glu Thr Asp Met Glu
210 215 220

Leu Lys Pro Ala Lys Val Ala Thr Arg Thr Ser Arg Gly Trp His Thr
225 230 235 240

Thr Asp Leu Lys Tyr Asn Pro Ser Arg Val Glu Ala Phe His Arg Tyr
245 250 255

Gly Thr Thr Val Asn Cys Ile Val Glu Glu Val Asp Ala Arg Ser Val
260 265 270

Tyr Pro Tyr Asp Glu Phe Val Leu Ala Thr Gly Asp Phe Val Tyr Met
275 280 285

Ser Pro Phe Tyr Gly Tyr Arg Glu Gly Ser His Thr Glu His Thr Ser
290 295 300

Tyr Ala Ala Asp Arg Phe Lys Gln Val Asp Gly Phe Tyr Ala Arg Asp
305 310 315 320

Leu Thr Thr Lys Ala Arg Ala Thr Ser Pro Thr Thr Arg Asn Leu Leu
325 330 335

Thr Thr Pro Lys Phe Thr Val Ala Trp Asp Trp Val Pro Lys Arg Pro
340 345 350

Ala Val Cys Thr Met Thr Lys Trp Gln Glu Val Asp Glu Met Leu Arg
355 360 365

Ala Glu Tyr Gly Gly Ser Phe Arg Phe Ser Ser Asp Ala Ile Ser Thr
370 375 380

Thr Phe Thr Thr Asn Leu Thr Gln Tyr Ser Leu Ser Arg Val Asp Leu
385 390 395 400

Gly Asp Cys Ile Gly Arg Asp Ala Arg Glu Ala Ile Asp Arg Met Phe
405 410 415

Ala Arg Lys Tyr Asn Ala Thr His Ile Lys Val Gly Gln Pro Gln Tyr
420 425 430

Tyr Leu Ala Thr Gly Gly Phe Leu Ile Ala Tyr Gln Pro Leu Leu Ser
435 440 445

Asn Thr Leu Ala Glu Leu Tyr Val Arg Glu Tyr Met Arg Glu Gln Asp
450 455 460

Arg Lys Pro Arg Asn Ala Thr Pro Ala Pro Leu Arg Glu Ala Pro Ser
465 470 475 480

Ala Asn Ala Ser Val Glu Arg Ile Lys Thr Thr Ser Ser Ile Glu Phe
485 490 495

Ala Arg Leu Gln Phe Thr Tyr Asn His Ile Gln Arg His Val Asn Asp
500 505 510

Met Leu Gly Arg Ile Ala Val Ala Trp Cys Glu Leu Gln Asn His Glu
515 520 525

Leu Thr Leu Trp Asn Glu Ala Arg Lys Leu Asn Pro Asn Ala Ile Ala
530 535 540

Ser Ala Thr Val Gly Arg Arg Val Ser Ala Arg Met Leu Gly Asp Val
545 550 555 560

Met Ala Val Ser Thr Cys Val Pro Val Ala Pro Asp Asn Val Ile Val
565 570 575

Gln Asn Ser Met Arg Val Ser Ser Arg Pro Gly Thr Cys Arg Pro Leu
585

Val Ser Phe Arg Tyr Glu Asp Gln Gly Pro Leu Ile Glu Gly Gln Leu
595 600 605

Gly Glu Asn Asn Glu Leu Arg Leu Thr Arg Asp Ala Leu Glu Pro Cys
610 615 620

Thr Val Gly His Arg Arg Tyr Phe Ile Phe Gly Gly Gly Tyr Val Tyr
625 630 635 640

Phe Glu Glu Tyr Ala Tyr Ser His Gln Leu Ser Arg Ala Asp Val Thr
645 650 655

Thr Val Ser Thr Phe Ile Asp Leu Asn Ile Thr Met Leu Glu Asp His
660 665 670

Glu Phe Val Pro Leu Glu Val Tyr Thr Arg His Glu Ile Lys Asp Ser
675 680 685

Gly Leu Leu Asp Tyr Thr Glu Val Gln Arg Arg Asn Gln Leu His Asp
690 695 700

Leu Arg Phe Ala Asp Ile Asp Thr Val Ile Arg Ala Asp Ala Asn Ala
705 710 715 720

Ala Met Phe Ala Gly Leu Cys Ala Phe Phe Glu Gly Met Gly Asp Leu
725 730 735

Gly Arg Ala Val Gly Lys Val Val Met Gly Val Val Gly Gly Val Val
740 745 750

Ser Ala Val Ser Gly Val Ser Ser Phe Met Ser Asn Pro Phe Gly Ala
755 760 765

Val Gly Leu Leu Val Leu Ala Gly Leu Val Ala Ala Phe Phe Ala Phe
770 775 780

Arg Tyr Val Leu Gln Leu Gln Arg Asn Pro Met Lys Ala Leu Tyr Pro
785 790 795 800

Leu Thr Thr Lys Glu Leu Lys Thr Ser Asp Pro Gly Gly Val Gly Gly
805 810 815

Glu Gly Glu Glu Gly Ala Glu Gly Gly Gly Phe Asp Glu Ala Lys Leu
820 825 830

Ala Glu Ala Arg Glu Met Ile Arg Tyr Met Ala Leu Val Ser Ala Met
835 840 845

Glu Arg Thr Glu His Lys Ala Arg Lys Lys Gly Thr Ser Ala Leu Leu
850 855 860

Ser Ser Lys Val Thr Asn Met Val Leu Arg Lys Arg Asn Lys Ala Arg
865 870 875 880

Tyr Ser Pro Leu His Asn Glu Asp Glu Ala Gly Asp Glu Asp Glu Leu
885 890 895

(2) INFORMATION FOR SEQ ID NO: 242:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:

Val Val Ala Gly Leu Gly Thr Gly Gly Gly Arg Glu Ala Gly Pro Pro
1 5 10 15

Phe Ala Ala Thr Val Ala Ala Thr Pro Pro Glu Arg Ala Ala Gly Ala
20 25 30

Ala Gly Ala Ala Asp Ala Thr Ala Ala Thr Ser Ala Pro Thr Thr Ser
35 40 45

Ala Gln Ile Lys Pro Pro Pro Arg Met Ala Gly Leu Arg Gly Arg Val
50 55 60

Ala Pro Ala Ala Arg
65

(2) INFORMATION FOR SEQ ID NO: 243:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 773 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:

Met Ala Ala Ala Pro Pro Ala Ala Val Ser Glu Pro Thr Ala Ala Arg
1 5 10 15

Gln Lys Leu Leu Ala Leu Leu Gly Gln Val Gln Thr Tyr Val Phe Gln
20 25 30

Leu Glu Leu Leu Arg Arg Cys Asp Pro Gln Ile Gly Leu Gly Lys Leu
35 40 45

Ala Gln Leu Lys Leu Asn Ala Leu Gln Val Arg Val Leu Arg Arg His
50 55 60

Leu Arg Pro Gly Leu Glu Ala Gln Ala Ala Ala Phe Leu Thr Pro Leu
65 70 75 80

Ser Val Thr Leu Glu Leu Leu Leu Glu Tyr Ala Trp Arg Glu Gly Glu
85 90 95

Arg Leu Leu Gly His Leu Glu Thr Phe Ala Thr Thr Gly Asp Val Ser
100 105 110

Ala Phe Phe Thr Glu Thr Met Gly Leu Ala Arg Pro Cys Pro Tyr His
115 120 125

Gln Gln Ile Arg Leu Glu Thr Tyr Gly Gly Asp Val Arg Met Glu Leu
130 135 140

Cys Phe Leu His Asp Val Glu Asn Phe Leu Lys Gln Leu Asn Tyr Cys
145 150 155 160

His Leu Ile Thr Pro Pro Ser Gly Ala Thr Ala Ala Leu Glu Arg Val
165 170 175

Arg Glu Phe Met Val Ala Ala Val Gly Ser Gly Leu Ile Val Pro Pro
180 185 190

Glu Leu Ser Asp Pro Ser His Pro Cys Ala Val Cys Phe Glu Glu Leu
195 200 205

Cys Val Thr Ala Asn Gln Gly Ala Thr Ile Ala Arg Arg Leu Ala Asp
210 215 220

Arg Ile Cys Asn His Val Thr Gln Gln Ala Gln Val Arg Leu Asp Ala
225 230 235 240

Asn Glu Leu Arg Arg Tyr Leu Pro His Ala Ala Gly Leu Ser Asp Ala
245 250 255

Ala Arg Ala Arg Ala Leu Cys Val Leu Asp Gln Ala Arg Thr Ala Ala
260 265 270

Gly Gly Gly Ala Arg Ala Gly Pro Pro Pro Ala Asp Ser Ser Ser Val
275 280 285

Arg Glu Glu Ala Asp Ala Leu Leu Glu Ala His Asp Val Phe Gln Ala
290 295 300

Thr Thr Pro Gly Ala Ile Ser Glu Leu Arg Phe Trp Leu Ala Ser Gly
305 310 315 320

Asp Arg Ala Arg His Ser Thr Met Asp Ala Phe Ala Asp Asn Leu Asn
325 330 335

Ala Gln Arg Glu Leu Gln Gln Glu Thr Ala Ala Val Ala Val Glu Leu
340 345 350

Ala Leu Phe Gly Arg Arg Ala Glu His Phe Asp Arg Ala Phe Gly Gly
355 360 365

His Leu Ala Ala Leu Asp Met Val Asp Ala Leu Ile Ile Gly Gly Gln
370 375 380

Ala Thr Ser Pro Asp Asp Gln Ile Glu Ala Leu Ile Arg Ala Cys Tyr
385 390 395 400

Asp His His Leu Thr Thr Pro Leu Leu Arg Arg Leu Val Ser Pro Glu
405 410 415

Gln Cys Asp Glu Glu Ala Leu Arg Arg Val Leu Ala Arg Leu Gly Ala
420 425 430

Gly Gly Ala Thr Gly Gly Ala Glu Glu Glu Glu Pro Arg Ala Ala Ala
435 440 445

Glu Glu Gly Gly Arg Arg Arg Gly Ala Gly Thr Pro Ala Ser Glu Asp
450 455 460

Gly Glu Arg Gly Pro Glu Pro Gly Ala Gln Gly Pro Glu Ser Trp Gly
465 470 475 480

Asp Ile Ala Thr Arg Ala Ala Ala Asp Val Pro Glu Arg Arg Arg Leu
485 490 495

Tyr Ala Asp Arg Leu Thr Lys Arg Ser Leu Ala Ser Leu Gly Arg Cys
500 505 510

Val Arg Glu Gln Arg Gly Glu Leu Glu Lys Met Leu Arg Val Ser Val
515 520 525

His Gly Glu Val Leu Pro Ala Thr Phe Ala Ala Val Ala Asn Gly Phe
530 535 540

Ala Ala Arg Ala Arg Phe Cys Ala Leu Thr Ala Gly Ala Gly Thr Val
545 550 555 560

Ile Asp Asn Arg Ala Ala Pro Gly Val Phe Asp Ala His Arg Phe Met
565 570 575

Arg Ala Ser Leu Leu Arg His Gln Val Asp Pro Ala Leu Leu Pro Ser
580 585 590

Ile Thr Phe Phe Glu Leu Val Asn Gly Pro Leu Phe Asp His Ser Thr
595 600 605

His Ser Phe Ala Gln Pro Pro Asn Thr Ala Leu Tyr Tyr Ser Val Glu
610 615 620

Asn Val Gly Leu Leu Pro His Leu Lys Glu Glu Leu Ala Arg Phe Ile
625 630 635 640

Met Gly Ala Gly Gly Ser Gly Ala Asp Trp Ala Val Ser Glu Phe Gln
645 650 655

Lys Phe Tyr Cys Phe Asp Gly Val Ser Gly Ile Thr Pro Thr Gln Arg
660 665 670

Ala Ala Trp Arg Tyr Ile Arg Glu Leu Ile Ile Ala Thr Thr Leu Phe
675 680 685

Ala Ser Val Tyr Arg Cys Gly Glu Leu Glu Leu Arg Arg Pro Asp Cys
690 695 700

Ser Arg Pro Thr Ser Glu Gly Arg Tyr Pro Pro Gly Val Tyr Leu Thr
705 710 715 720

Tyr Asn Ser Asp Cys Pro Leu Val Ala Ile Val Glu Ser Gly Pro Asp
725 730 735

Gly Cys Ile Gly Pro Arg Ser Val Val Val Tyr Asp Arg Asp Val Phe
740 745 750

Ser Ile Lys Val Leu Gln His Leu Ala Pro Arg Leu Ala Gly Gly Gly
755 760 765

Ser Asp Ala Pro Pro
770

(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

Met Asp Thr Lys Pro Lys Thr Thr Thr Thr Val Lys Val Pro Pro Gly
1 5 10 15

Pro Met Gly Tyr Val Tyr Gly Arg Ala Cys Pro Ala Glu Gly Leu Glu
20 25 30

Leu Leu Ser Leu Leu Ser Ala Arg Ser Gly Asp Ala Asp Val Ala Val
35 40 45

Ala Pro Leu Ile Val Gly Leu Thr Val Glu Ser Gly Phe Glu Ala Asn
50 55 60

Val Ala Ala Val Val Gly Ser Arg Thr Thr Gly Leu Gly Gly Thr Ala
65 70 75 80

Val Ser Leu Lys Leu Met Pro Ser His Tyr Ser Pro Ser Val Tyr Val
85 90 95

Phe His Gly Gly Arg His Leu Ala Pro Ser Thr Gln Ala Pro Asn Leu
100 105 110

Thr Arg Leu Cys Glu Arg Ala Arg Arg His Phe Gly Phe Ser Asp Tyr
115 120 125

Ala Pro Arg Pro Cys Asp Leu Lys His Glu Thr Thr Gly Asp Ala Leu
130 135 140

Cys Glu Arg Leu Gly Leu Asp Pro Asp Arg Ala Leu Leu Tyr Leu Val
145 150 155 160

Ile Thr Glu Gly Phe Arg Glu Ala Val Cys Ile Ser Asn Thr Phe Leu
165 170 175

His Leu Gly Gly Met Asp Lys Val Thr Ile Gly Asp Ala Glu Val His
180 185 190

Arg Ile Pro Val Tyr Pro Leu Gln Met Phe Met Pro Asp Phe Ser Arg
195 200 205

Val Ile Ala Asp Pro Phe Asn Cys Asn His Arg Ser Ile Gly Glu Asn
210 215 220

Phe Asn Tyr Pro Leu Pro Phe Phe Asn Arg Pro Leu Ala Arg Leu Leu
225 230 235 240

Phe Glu Ala Val Val Gly Pro Ala Ala Val Arg Ala Arg Asn Val Asp
245 250 255

Ala Val Ala Arg Ala Ala Ala His Leu Ala Phe Asp Glu Asn His Glu
260 265 270

Gly Ala Ala Leu Pro Ala Asp Ile Thr Phe Thr Ala Phe Glu Ala Ser
275 280 285

Gln Gly Lys Pro Gln Arg Gly Ala Arg Asp Ala Gly Asn Lys Gly Pro
290 295 300

Ala Gly Gly Phe Glu Gln Arg Leu Ala Ser Val Met Ala Gly Asp Ala
305 310 315 320

Ala Leu Glu Ser Ile Val Ser Met Ala Val Phe Asp Glu Pro Pro Pro
325 330 335

Asp Ile Thr Thr Trp Pro Leu Leu Glu Gly Gln Glu Thr Pro Ala Ala
340 345 350

Arg Ala Gly Ala Val Gly Ala Tyr Leu Ala Arg Ala Ala Gly Leu Val
355 360 365

Gly Ala Met Val Phe Ser Thr Asn Ser Ala Leu His Leu Thr Glu Val
370 375 380

Asp Asp Ala Gly Pro Ala Asp Pro Lys Asp His Ser Lys Pro Ser Phe
385 390 395 400

Tyr Arg Phe Phe Leu Val Pro Gly Thr His Val Ala Ala Asn Pro Gln
405 410 415

Leu Asp Arg Glu Gly His Val Val Pro Gly Tyr Glu Gly Arg Pro Thr
420 425 430

Ala Pro Leu Val Gly Gly Thr Gln Glu Phe Ala Gly Glu His Leu Ala
435 440 445

Met Leu Cys Gly Phe Ser Pro Ala Leu Leu Ala Lys Met Leu Phe Tyr
450 455 460

Leu Glu Arg Cys Asp Gly Gly Val Ile Val Gly Arg Gln Glu Met Asp
465 470 475 480

Val Phe Arg Tyr Val Ala Asp Ser Gly Gln Thr Asp Val Pro Cys Asn
485 490 495

Leu Cys Thr Phe Glu Thr Arg His Ala Cys Ala His Thr Thr Leu Met
500 505 510

Arg Leu Arg Ala Arg His Pro Lys Phe Ala Ser Ala Arg Ala Ile Gly
515 520 525

Val Phe Gly Thr Met Asn Ser Ala Tyr Ser Asp Cys Asp Val Leu Gly
530 535 540

Asn Tyr Ala Ala Phe Ser Ala Leu Lys Arg Ala Asp Gly Ser Glu Asn
545 550 555 560

Thr Arg Thr Ile Met Gln Glu Tyr Ala Ala Thr Glu Arg Val Met Ala
565 570 575

Glu Leu Glu Ala Leu Gln Tyr Val Asp Gln Ala Val Pro Thr Ala Leu
580 585 590

Gly Arg Leu Glu Thr Ile Ile Gly Thr Arg Glu Ala Leu His Thr Val
595 600 605

Val Asn Asn Ile Lys Gln Leu Val
610 615

(2) INFORMATION FOR SEQ ID NO: 245:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:

Met Asp Thr Lys Pro Lys Thr Thr Thr Thr Val Lys Val Pro Pro Gly
1 5 10 15

Pro Met Gly Tyr Val Tyr Gly Arg Ala Cys Pro Ala Glu Gly Leu Glu
20 25 30

Leu Leu Ser Leu Leu Ser Ala Arg Ser Gly Asp Ala Asp Val Ala Val
35 40 45

Ala Pro Leu Ile Val Gly Leu Thr Val Glu Ser Gly Phe Glu Ala Asn
50 55 60

Val Ala Ala Val Val Gly Ser Arg Thr Thr Gly Leu Gly Gly Thr Ala
65 70 75 80

Val Ser Leu Lys Leu Met Pro Ser His Tyr Ser Pro Ser Val Tyr Val
85 90 95

Phe His Gly Gly Arg His Leu Ala Pro Ser Thr Gln Ala Pro Asn Leu
100 105 110

Thr Arg Leu Cys Glu Arg Ala Arg Arg His Phe Gly Phe Ser Asp Tyr
115 120 125

Ala Pro Arg Pro Cys Asp Leu Lys His Glu Thr Thr Gly Asp Ala Leu
130 135 140

Cys Glu Arg Leu Gly Leu Asp Pro Asp Arg Ala Leu Leu Tyr Leu Val
145 150 155 160

Ile Thr Glu Gly Phe Arg Glu Ala Val Cys Ile Ser Asn Thr Phe Leu
165 170 175

His Leu Gly Gly Met Asp Lys Val Thr Ile Gly Asp Ala Glu Val His
180 185 190

Arg Ile Pro Val Tyr Pro Leu Gln Met Phe Met Pro Asp Phe Ser Arg
195 200 205

Val Ile Ala Asp Pro Phe Asn Cys Asn His Arg Ser Ile Gly Glu Asn
210 215 220

Phe Asn Tyr Pro Leu Pro Phe Phe Asn Arg Pro Leu Ala Arg Leu Leu
225 230 235 240

Phe Glu Ala Val Val Gly Pro Ala Ala Val Arg Ala Arg Asn Val Asp
245 250 255

Ala Val Ala Arg Ala Ala Ala His Leu Ala Phe Asp Glu Asn His Glu
260 265 270

Gly Ala Ala Leu Pro Ala Asp Ile Thr Phe Thr Ala Phe Glu Ala Ser
275 280 285

Gln Gly Lys Pro Gln Arg Gly Ala Arg Asp Ala Gly Asn Lys Gly Pro
290 295 300

Ala Gly Gly Phe Glu Gln Arg Leu Ala Ser Val Met Ala Gly Asp Ala
305 310 315 320

Ala Leu Glu Ser Ile Val Ser Met Ala Val Phe Asp Glu Pro Pro Pro
325 330 335

Asp Ile Thr Thr Trp Pro Leu Leu Glu Gly Gln Glu Thr Pro Ala Ala
340 345 350

Arg Ala Gly Ala Val Gly Ala Tyr Leu Ala Arg Ala Ala Gly Leu Val
355 360 365

Gly Ala Met Val Phe Ser Thr Asn Ser Ala Leu His Leu Thr Glu Val
370 375 380

Asp Asp Ala Gly Pro Ala Asp Pro Lys Asp His Ser Lys Pro Ser Phe
385 390 395 400

Tyr Arg Phe Phe Leu Val Pro Gly Thr His Val Ala Ala Asn Pro Gln
405 410 415

Leu Asp Arg Glu Gly His Val Val Pro Gly Tyr Glu Gly Arg Pro Thr
420 425 430

Ala Pro Leu Val Gly Gly Thr Gln Glu Phe Ala Gly Glu His Leu Ala
435 440 445

Met Leu Cys Gly Phe Ser Pro Ala Leu Leu Ala Lys Met Leu Phe Tyr
450 455 460

Leu Glu Arg Cys Asp Gly Gly Val Ile Val Gly Arg Gln Glu Met Asp
465 470 475 480

Val Phe Arg Tyr Val Ala Asp Ser Gly Gln Thr Asp Val Pro Cys Asn
485 490 495

Leu Cys Thr Phe Glu Thr Arg His Ala Cys Ala His Thr Thr Leu Met
500 505 510

Arg Leu Arg Ala Arg His Pro Lys Phe Ala Ser Ala Arg Ala Ile Gly
515 520 525

Val Phe Gly Thr Met Asn Ser Ala Tyr Ser Asp Cys Asp Val Leu Gly
530 535 540

Asn Tyr Ala Ala Phe Ser Ala Leu Lys Arg Ala Asp Gly Ser Glu Asn
545 550 555 560

Thr Arg Thr Ile Met Gln Glu Tyr Ala Ala Thr Glu Arg Val Met Ala
565 570 575

Glu Leu Glu Ala Leu Gln Tyr Val Asp Gln Ala Val Pro Thr Ala Leu
580 585 590

Gly Arg Leu Glu Thr Ile Ile Gly Thr Arg Glu Ala Leu His Thr Val
595 600 605

Val Asn Asn Ile Lys Gln Leu Val
610 615

(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1228 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:

Met Phe Cys Ala Ala Gly Gly Pro Thr Ser Pro Gly Gly Lys Ser Ala
1 5 10 15

Ala Arg Ala Ala Ser Gly Phe Phe Ala Pro His Asn Pro Arg Gly Ala
20 25 30

Thr Gln Thr Ala Pro Pro Pro Cys Arg Arg Gln Asn Phe Tyr Asn Pro
35 40 45

His Leu Ala Gln Thr Gly Thr Gln Pro Lys Ala Pro Gly Pro Ala Gln
50 55 60

Arg His Thr Tyr Tyr Ser Glu Cys Asp Glu Phe Arg Phe Ile Ala Pro
65 70 75 80

Arg Ser Leu Asp Glu Asp Ala Pro Ala Glu Gln Arg Thr Gly Val His
85 90 95

Asp Gly Arg Leu Arg Arg Ala Pro Lys Val Tyr Cys Gly Gly Asp Glu
100 105 110

Arg Asp Val Leu Arg Val Gly Pro Glu Gly Phe Trp Pro Arg Arg Leu
115 120 125

Arg Leu Trp Gly Gly Ala Asp His Ala Pro Glu Gly Phe Asp Pro Thr
130 135 140

Val Thr Val Phe His Val Tyr Asp Ile His Val Glu His Ala Tyr Ser
145 150 155 160

Met Arg Ala Ala Gln Leu His Glu Arg Phe Met Asp Ala Ile Thr Pro
165 170 175

Ala Gly Thr Val Ile Thr Leu Leu Gly Leu Thr Pro Glu Gly His Arg
180 185 190

Val Ala Val His Val Tyr Gly Thr Arg Gln Tyr Phe Tyr Met Asn Lys
195 200 205

Ala Glu Val Asp Arg His Leu Gln Cys Arg Ala Pro Arg Asp Leu Cys
210 215 220

Glu Arg Leu Ala Ala Ala Leu Arg Glu Ser Pro Gly Ala Ser Phe Arg
225 230 235 240

Gly Ile Ser Ala Asp His Phe Glu Ala Glu Val Val Glu Arg Ala Asp
245 250 255

Val Tyr Tyr Tyr Glu Trp Thr Leu Tyr Tyr Arg Val Phe Val Arg Ser
260 265 270

Gly Arg Ala Tyr Leu Cys Asp Asn Phe Cys Pro Ala Ile Arg Lys Tyr
275 280 285

Glu Gly Gly Val Asp Ala Thr Thr Arg Phe Ile Leu Asp Asn Pro Gly
290 295 300

Phe Val Thr Phe Gly Trp Tyr Arg Leu Lys Pro Gly Arg Gly Asn Ala
305 310 315 320

Pro Ala Gln Pro Arg Pro Pro Thr Ala Phe Gly Thr Ser Ser Asp Val
325 330 335

Glu Phe Asn Cys Thr Ala Asp Asn Leu Ala Val Glu Gly Ala Met Cys
340 345 350

Asp Leu Pro Ala Tyr Lys Leu Met Cys Phe Asp Ile Glu Cys Lys Ala
355 360 365

Gly Gly Glu Asp Glu Leu Ala Phe Pro Val Ala Glu Arg Pro Glu Asp
370 375 380

Leu Val Ile Gln Ile Ser Cys Leu Leu Tyr Asp Leu Ser Thr Thr Ala
385 390 395 400

Leu Glu His Ile Leu Leu Phe Ser Leu Gly Ser Cys Asp Leu Pro Glu
405 410 415

Ser His Leu Ser Asp Leu Ala Ser Arg Gly Leu Pro Ala Pro Val Val
420 425 430

Leu Glu Phe Asp Ser Glu Phe Glu Met Leu Leu Ala Phe Met Thr Phe
435 440 445

Val Lys Gln Tyr Gly Pro Glu Phe Val Thr Gly Tyr Asn Ile Ile Asn
450 455 460

Phe Asp Trp Pro Phe Val Leu Thr Lys Leu Thr Glu Ile Tyr Lys Val
465 470 475 480

Pro Leu Asp Gly Tyr Gly Arg Met Asn Gly Arg Gly Val Phe Arg Val
485 490 495

Trp Asp Ile Gly Gln Ser His Phe Gln Lys Arg Ser Lys Ile Lys Val
500 505 510

Asn Gly Met Val Asn Ile Asp Met Tyr Gly Ile Ile Thr Asp Lys Val
515 520 525

Lys Leu Ser Ser Tyr Lys Leu Asn Ala Val Ala Glu Ala Val Leu Lys
530 535 540

Asp Lys Lys Lys Asp Leu Ser Tyr Arg Asp Ile Pro Ala Tyr Tyr Ala
545 550 555 560

Ser Gly Pro Ala Gln Arg Gly Val Ile Gly Glu Tyr Cys Val Gln Asp
565 570 575

Ser Leu Leu Val Gly Gln Leu Phe Phe Lys Phe Leu Pro His Leu Glu
580 585 590

Leu Ser Ala Val Ala Arg Leu Ala Gly Ile Asn Ile Thr Arg Thr Ile
595 600 605

Tyr Asp Gly Gln Gln Ile Arg Val Phe Thr Cys Leu Leu Arg Leu Ala
610 615 620

Gly Gln Lys Gly Phe Ile Leu Pro Asp Thr Gln Gly Arg Phe Arg Gly
625 630 635 640

Leu Asp Lys Glu Ala Pro Lys Arg Pro Ala Val Pro Arg Gly Glu Gly
645 650 655

Glu Arg Pro Gly Asp Gly Asn Gly Asp Glu Asp Lys Asp Asp Asp Glu
660 665 670

Asp Gly Asp Glu Asp Gly Asp Glu Arg Glu Glu Val Ala Arg Glu Thr
675 680 685

Gly Gly Arg His Val Gly Tyr Gln Gly Ala Arg Val Leu Asp Pro Thr
690 695 700

Ser Gly Phe His Val Asp Pro Val Val Val Phe Asp Phe Ala Ser Leu
705 710 715 720

Tyr Pro Ser Ile Ile Gln Ala His Asn Leu Cys Phe Ser Thr Leu Ser
725 730 735

Leu Arg Pro Glu Ala Val Ala His Leu Glu Ala Asp Arg Asp Tyr Leu
740 745 750

Glu Ile Glu Val Gly Gly Arg Arg Leu Phe Phe Val Lys Ala His Val
755 760 765

Arg Glu Ser Leu Leu Ser Ile Leu Leu Arg Asp Trp Leu Ala Met Arg
770 775 780

Lys Gln Ile Arg Ser Arg Ile Pro Gln Ser Thr Pro Glu Glu Ala Val
785 790 795 800

Leu Leu Asp Lys Gln Gln Ala Ala Ile Lys Val Val Cys Asn Ser Val
805 810 815

Tyr Gly Phe Thr Gly Val Gln His Gly Leu Leu Pro Cys Leu His Val
820 825 830

Ala Ala Thr Val Thr Thr Ile Gly Arg Glu Met Leu Leu Ala Thr Arg
835 840 845

Ala Tyr Val His Ala Arg Trp Ala Glu Phe Asp Gln Leu Leu Ala Asp
850 855 860

Phe Pro Glu Ala Ala Gly Met Arg Ala Pro Gly Pro Tyr Ser Met Arg
865 870 875 880

Ile Ile Tyr Gly Asp Thr Asp Ser Ile Phe Val Leu Cys Arg Gly Leu
885 890 895

Thr Ala Ala Gly Leu Val Ala Met Gly Asp Lys Met Ala Ser His Arg
900 905 910

Ala Leu Phe Leu Pro Pro Ile Lys Leu Glu Cys Glu Lys Thr Phe Thr
915 920 925

Lys Leu Leu Leu Ile Ala Lys Lys Lys Tyr Ile Gly Val Ile Cys Gly
930 935 940

Gly Lys Met Leu Ile Lys Gly Val Asp Leu Val Arg Lys Asn Asn Cys
945 950 955 960

Ala Phe Ile Asn Arg Thr Ser Arg Ala Leu Val Asp Leu Leu Phe Tyr
965 970 975

Asp Asp Thr Val Ser Gly Ala Ala Ala Ala Glu Arg Pro Ala Glu Glu
980 985 990

Trp Leu Ala Arg Pro Leu Pro Glu Gly Leu Gln Ala Phe Gly Ala Val
995 1000 1005

Leu Val Asp Ala His Arg Arg Ile Thr Asp Pro Glu Arg Asp Ile Gln
1010 1015 1020

Asp Phe Val Leu Thr Ala Glu Leu Ser Arg His Pro Arg Ala Tyr Thr
1025 1030 1035 104

Asn Lys Arg Leu Ala His Leu Thr Val Tyr Tyr Lys Leu Met Ala Arg
1045 1050 1055

Arg Ala Gln Val Pro Ser Ile Lys Asp Arg Ile Pro Tyr Val Ile Val
1060 1065 1070

Ala Gln Thr Arg Glu Val Glu Glu Thr Val Ala Arg Leu Ala Ala Leu
1075 1080 1085

Arg Glu Leu Asp Ala Ala Ala Pro Gly Asp Glu Pro Ala Pro Pro Ala
1090 1095 1100

Ala Leu Pro Ser Pro Ala Lys Arg Pro Arg Glu Thr Pro Ser His Ala
1105 1110 1115 112

Asp Pro Pro Gly Gly Ala Ser Lys Pro Arg Lys Leu Leu Val Ser Glu
1125 1130 1135

Leu Ala Glu Asp Pro Gly Tyr Ala Ile Arg Val Pro Leu Asn Thr Asp
1140 1145 1150

Tyr Tyr Phe Ser His Leu Leu Gly Ala Ala Cys Val Thr Phe Lys Ala
1155 1160 1165

Leu Phe Gly Asn Asn Ala Lys Ile Thr Glu Ser Leu Leu Lys Arg Phe
1170 1175 1180

Ile Pro Glu Thr Trp His Pro Pro Asp Asp Val Ala Ala Arg Leu Arg
1185 1190 1195 120

Ala Ala Gly Phe Gly Pro Ala Gly Ala Gly Ala Thr Ala Glu Glu Thr
1205 1210 1215

Arg Arg Met Leu His Arg Ala Phe Asp Thr Leu Ala
1220 1225



(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 303 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247:



Met Tyr Asp Ile Ala Pro Arg Arg Ser Gly Ser Arg Pro Gly Pro Gly
1 5 10 15

Arg Asp Lys Thr Arg Arg Arg Ser Arg Phe Ser Ala Ala Gly Asn Pro
20 25 30

Gly Val Glu Arg Arg Ala Ser Arg Lys Ser Leu Pro Ser His Ala Arg
35 40 45

Arg Leu Glu Leu Cys Leu His Glu Arg Arg Arg Tyr Arg Gly Phe Phe
50 55 60

Ala Ala Gln Thr Pro Ser Glu Glu Ile Ala Ile Val Arg Ser Leu Ser
65 70 75 80

Val Pro Leu Val Lys Thr Thr Pro Val Ser Leu Pro Phe Ser Leu Asp
85 90 95

Gln Thr Val Ala Asp Asn Cys Leu Thr Leu Ser Gly Met Gly Tyr Tyr
100 105 110

Leu Gly Ile Gly Gly Cys Cys Pro Ala Cys Ser Ala Gly Asp Gly Arg
115 120 125

Leu Ala Thr Val Ser Arg Glu Ala Leu Ile Leu Ala Phe Val Gln Gln
130 135 140

Ile Asn Thr Ile Phe Glu His Arg Thr Phe Leu Ala Ser Leu Val Val
145 150 155 160

Leu Ala Asp Arg His Ser Thr Pro Leu Gln Asp Leu Leu Ala Asp Thr
165 170 175

Leu Gly Gln Pro Glu Leu Phe Phe Val His Thr Ile Leu Arg Gly Gly
180 185 190

Gly Ala Cys Asp Pro Arg Phe Leu Phe Tyr Pro Asp Pro Thr Tyr Gly
195 200 205

Gly His Met Leu Tyr Val Ile Phe Pro Gly Thr Ser Ala His Leu His
210 215 220

Tyr Arg Leu Ile Asp Arg Met Leu Thr Ala Cys Pro Gly Tyr Arg Phe
225 230 235 240

Ala Ala His Val Trp Gln Ser Thr Phe Val Leu Val Val Arg Arg Asn
245 250 255

Ala Glu Lys Pro Ala Asp Ala Glu Ile Pro Thr Val Ser Ala Ala Asp
260 265 270

Ile Tyr Cys Lys Met Arg Asp Ile Ser Phe Asp Gly Gly Leu Met Leu
275 280 285

Glu Tyr Gln Arg Leu Tyr Ala Thr Phe Asp Glu Phe Pro Pro Pro
290 295 300

(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 590 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

Met Ala Thr Ser Ala Pro Gly Val Pro Ser Ser Ala Ala Val Arg Glu
1 5 10 15

Glu Ser Pro Gly Ser Ser Trp Lys Glu Gly Ala Phe Glu Arg Pro Tyr
20 25 30

Val Ala Phe Asp Pro Asp Leu Leu Ala Leu Asn Glu Ala Leu Cys Ala
35 40 45

Glu Leu Leu Ala Ala Cys His Val Val Gly Val Pro Pro Ala Ser Ala
50 55 60

Leu Asp Glu Asp Val Glu Ser Asp Val Ala Pro Ala Pro Pro Arg Pro
65 70 75 80

Arg Gly Ala Ala Arg Glu Ala Ser Gly Gly Arg Gly Pro Gly Ser Arg
85 90 95

Pro Pro Ala Asp Pro Thr Ala Glu Gly Leu Leu Asp Thr Gly Pro Phe
100 105 110

Ala Ala Ala Ser Val Asp Thr Phe Ala Leu Asp Arg Pro Cys Leu Val
115 120 125

Cys Arg Thr Ile Glu Leu Tyr Lys Gln Ala Tyr Arg Leu Ser Pro Gln
130 135 140

Trp Val Ala Asp Tyr Ala Phe Leu Cys Ala Lys Cys Leu Gly Ala Pro
145 150 155 160

His Cys Ala Ala Ser Ile Phe Val Ala Ala Phe Glu Phe Val Tyr Val
165 170 175

Met Asp His His Phe Leu Arg Thr Lys Lys Ala Thr Leu Val Gly Ser
180 185 190

Phe Ala Arg Phe Ala Leu Thr Ile Asn Asp Ile His Arg His Phe Phe
195 200 205

Leu His Cys Cys Phe Arg Thr Asp Gly Gly Val Pro Gly Arg His Ala
210 215 220

Gln Lys Gln Pro Arg Pro Thr Pro Ser Pro Gly Ala Ala Lys Val Gln
225 230 235 240

Tyr Ser Asn Tyr Ser Phe Leu Ala Gln Ser Ala Thr Arg Ala Leu Ile
245 250 255

Gly Thr Leu Ala Ser Gly Gly Asp Asp Gly Ala Gly Ala Gly Gly Gly
260 265 270

Ser Gly Thr Gln Pro Ser Leu Thr Thr Ala Leu Met Asn Trp Lys Asp
275 280 285

Cys Ala Arg Leu Leu Asp Cys Thr Glu Gly Lys Arg Gly Gly Gly Asp
290 295 300

Ser Cys Cys Thr Arg Ala Ala Ala Arg Asn Gly Glu Phe Glu Ala Ala
305 310 315 320

Ala Gly Ala Gln Gly Gly Glu Pro Glu Thr Trp Ala Tyr Ala Asp Leu
325 330 335

Ile Leu Leu Leu Leu Ala Gly Thr Pro Ala Val Trp Glu Ser Gly Pro
340 345 350

Arg Leu Arg Ala Ala Ala Asp Ala Arg Arg Ala Ala Val Ser Glu Ser
355 360 365

Trp Glu Ala His Arg Gly Ala Arg Met Arg Asp Ala Ala Pro Arg Phe
370 375 380

Ala Gln Phe Ala Glu Pro Lys Ala Gln Pro Asp Leu Asp Leu Gly Pro
385 390 395 400

Leu Met Ala Thr Val Leu Lys His Gly Arg Gly Arg Gly Arg Thr Gly
405 410 415

Gly Glu Cys Leu Leu Cys Asn Leu Leu Leu Val Arg Ala Tyr Trp Leu
420 425 430

Ala Met Arg Arg Leu Arg Ala Ser Val Val Arg Tyr Ser Glu Asn Asn
435 440 445

Thr Ser Leu Phe Asp Cys Ile Val Pro Val Val Asp Gln Leu Glu Ala
450 455 460

Asp Pro Glu Ala Gln Pro Gly Asp Gly Gly Arg Phe Val Ser Leu Leu
465 470 475 480

Arg Ala Ala Gly Pro Glu Ala Ile Phe Lys His Met Phe Cys Asp Pro
485 490 495

Met Cys Ala Ile Thr Glu Met Glu Val Asp Pro Trp Val Leu Phe Gly
500 505 510

His Pro Arg Ala Asp His Arg Asp Glu Leu Gln Leu His Lys Ala Lys
515 520 525

Leu Ala Cys Gly Asn Glu Phe Glu Gly Arg Val Cys Ile Ala Leu Arg
530 535 540

Ala Leu Ile Tyr Thr Phe Lys Thr Tyr Gln Val Phe Val Pro Lys Pro
545 550 555 560

Thr Ala Thr Phe Val Arg Glu Ala Gly Ala Leu Leu Arg Arg His Ser
565 570 575

Ile Ser Leu Leu Ser Leu Glu His Thr Leu Cys Thr Tyr Val
580 585 590



(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 128 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:



Met Ala Gly Arg Ala Gly Arg Trp Arg Thr Leu Arg Asp Ala Ile Pro
1 5 10 15

Asp Cys Ala Leu Arg Ser Gln Thr Leu Glu Ser Leu Asp Ala Arg Tyr
20 25 30

Val Ser Arg Asp Gly Ala Gly Asp Ala Ala Val Trp Phe Glu Asp Met
35 40 45

Thr Pro Ala Glu Leu Glu Val Ile Phe Pro Thr Thr Asp Ala Lys Leu
50 55 60

Asn Tyr Leu Ser Arg Thr Gln Arg Leu Ala Ser Leu Leu Thr Tyr Ala
65 70 75 80

Gly Pro Ile Lys Ala Pro Asp Gly Pro Ala Ala Pro His Thr Gln Asp
85 90 95

Thr Ala Cys Val His Gly Glu Leu Leu Ala Arg Lys Arg Glu Arg Phe
100 105 110

Ala Ala Val lie Asn Arg Phe Leu Asp Leu His Gin lie Leu Arg Gly 
115 120 125 
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(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 
5 (A) LENGTH: 112 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



10 (ii) MOLECULE TYPE: peptide 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250: 



Met Ala Ala Pro Gln Phe His Arg Pro Ser Thr Ile Thr Ala Asp Asn
1 5 10 15

Val Arg Ala Leu Gly Met Arg Gly Leu Val Leu Ala Thr Asn Asn Ala
20 25 30

Gln Phe Ile Met Asp Asn Ser Tyr Pro His Pro His Gly Thr Gln Gly
35 40 45

Ala Val Arg Glu Phe Leu Arg Gly Gln Ala Ala Ala Leu Thr Asp Leu
50 55 60

Gly Val Thr His Ala Asn Asn Thr Phe Ala Pro Gln Pro Met Phe Ala
65 70 75 80

Gly Asp Ala Ala Ala Glu Trp Leu Arg Pro Ser Phe Gly Leu Lys Arg
85 90 95

Thr Tyr Ser Pro Phe Val Val Arg Asp Pro Lys Thr Pro Ser Thr Pro
100 105 110



(2) INFORMATION FOR SEQ ID NO: 251: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 112 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Met Ala Ala Pro Gln Phe His Arg Pro Ser Thr Ile Thr Ala Asp Asn
1 5 10 15

Val Arg Ala Leu Gly Met Arg Gly Leu Val Leu Ala Thr Asn Asn Ala
20 25 30

Gln Phe Ile Met Asp Asn Ser Tyr Pro His Pro His Gly Thr Gln Gly
35 40 45

Ala Val Arg Glu Phe Leu Arg Gly Gln Ala Ala Ala Leu Thr Asp Leu
50 55 60

Gly Val Thr His Ala Asn Asn Thr Phe Ala Pro Gln Pro Met Phe Ala
65 70 75 80

Gly Asp Ala Ala Ala Glu Trp Leu Arg Pro Ser Phe Gly Leu Lys Arg
85 90 95

Thr Tyr Ser Pro Phe Val Val Arg Asp Pro Lys Thr Pro Ser Thr Pro
100 105 110

(2) INFORMATION FOR SEQ ID NO: 252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3051 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:

Met Ile Pro Ala Ala Leu Pro His Pro Thr Met Lys Arg Gln Gly Asp
1 5 10 15

Arg Asp Ile Val Val Thr Gly Val Arg Asn Gln Phe Ala Thr Asp Leu
20 25 30

Glu Pro Gly Gly Ser Val Ser Cys Met Arg Ser Ser Leu Ser Phe Leu
35 40 45

Ser Leu Leu Phe Asp Val Gly Pro Arg Asp Val Leu Ser Ala Glu Ala
50 55 60

Ile Glu Gly Cys Leu Val Glu Gly Gly Glu Trp Thr Arg Ala Ala Ala
65 70 75 80

Gly Ser Gly Pro Pro Arg Met Cys Ser Ile Ile Glu Leu Pro Asn Phe
85 90 95

Leu Glu Tyr Pro Ala Arg Gly Leu Arg Cys Val Phe Ser Arg Val Tyr
100 105 110

Gly Glu Val Gly Phe Phe Gly Glu Pro Thr Ala Gly Leu Leu Glu Thr
115 120 125

Gln Cys Pro Ala His Thr Phe Phe Ala Gly Pro Trp Ala Met Arg Pro
130 135 140

Leu Ser Tyr Thr Leu Leu Thr Ile Gly Pro Leu Gly Met Gly Arg Asp
145 150 155 160

Gly Asp Thr Ala Tyr Leu Phe Asp Pro His Gly Leu Pro Ala Gly Thr
165 170 175

Pro Ala Phe Ile Ala Lys Val Arg Ala Gly Asp Val Tyr Pro Tyr Leu
180 185 190

Thr Tyr Tyr Ala His Asp Arg Pro Lys Val Arg Trp Ala Gly Ala Met
195 200 205

Val Phe Phe Val Pro Ser Gly Pro Gly Ala Val Ala Pro Ala Asp Leu
210 215 220

Thr Ala Ala Ala Leu His Leu Tyr Gly Ala Ser Glu Thr Tyr Leu Gln
225 230 235 240

Asp Glu Pro Phe Val Glu Arg Arg Val Ala Ile Thr His Pro Leu Arg
245 250 255

Gly Glu Ile Gly Gly Leu Gly Ala Leu Phe Val Gly Val Val Pro Arg
260 265 270

Gly Asp Gly Glu Gly Ser Gly Pro Val Val Pro Ala Leu Pro Ala Pro
275 280 285

Thr His Val Gln Thr Pro Arg Ala Asp Arg Pro Pro Glu Ala Pro Arg
290 295 300

Gly Ala Ser Gly Pro Pro Asn Thr Pro Gln Ala Gly His Pro Asn Arg
305 310 315 320

Pro Pro Asp Asp Val Trp Ala Ala Ala Leu Glu Gly Thr Pro Pro Ala
325 330 335

Lys Pro Ser Ala Pro Asp Ala Ala Ala Ser Gly Pro Pro His Ala Ala
340 345 350

Pro Pro Pro Gln Thr Pro Ala Gly Asp Ala Ala Glu Glu Ala Glu Asp
355 360 365

Leu Arg Val Leu Glu Val Gly Ala Val Pro Val Gly Arg His Arg Ala
370 375 380

Arg Tyr Ser Thr Gly Leu Pro Lys Arg Arg Arg Pro Thr Trp Thr Pro
385 390 395 400

Pro Ser Ser Val Glu Asp Leu Thr Ser Gly Glu Arg Pro Ala Pro Lys
405 410 415

Ala Pro Pro Ala Lys Ala Lys Lys Lys Ser Ala Pro Lys Lys Lys Ala
420 425 430

Pro Val Ala Ala Glu Val Pro Ala Ser Ser Pro Thr Pro Ile Ala Ala
435 440 445

Thr Val Pro Pro Ala Pro Asp Thr Pro Pro Gln Ser Gly Gln Gly Gly
450 455 460

Gly Asp Asp Gly Pro Asp Ser Ser Pro Ser Val Leu Glu Thr Leu Gly
465 470 475 480

Ala Arg Arg Pro Pro Glu Pro Pro Gly Ala Asp Leu Ala Gln Leu Phe
485 490 495

Glu Val His Pro Asn Val Ala Ala Thr Ala Val Arg Leu Ala Ala Arg
500 505 510

Asp Ala Ala Arg Glu Val Ala Ala Cys Ser Gln Leu Thr Ile Asn Ala
515 520 525

Leu Arg Ser Pro Tyr Pro Ala His Pro Gly Leu Leu Glu Leu Cys Val
530 535 540

Ile Phe Phe Phe Glu Arg Val Leu Ala Phe Leu Ile Glu Asn Gly Ala
545 550 555 560

Arg Thr His Thr Gln Ala Gly Val Ala Gly Pro Ala Ala Ala Leu Leu
565 570 575

Asp Phe Thr Leu Arg Met Leu Pro Arg Lys Thr Ala Val Gly Asp Phe
580 585 590

Leu Ala Ser Thr Arg Met Ser Leu Ala Asp Val Ala Ala His Arg Pro
595 600 605

Leu Ile Gln His Val Leu Asp Glu Asn Ser Gln Ile Gly Arg Leu Ala
610 615 620

Lys Leu Val Leu Val Ala Arg Asp Val Ile Arg Glu Thr Asp Ala Phe
625 630 635 640

Tyr Gly Asp Leu Ala Asp Leu Asp Leu Gln Leu Arg Ala Ala Pro Pro
645 650 655

Ala Asn Leu Tyr Ala Arg Leu Gly Glu Trp Leu Leu Glu Arg Ser Arg
660 665 670

Ala His Pro Asn Thr Leu Phe Ala Pro Ala Thr Pro Thr His Pro Glu
675 680 685

Pro Leu Leu His Arg Ile Gln Ala Gln Phe Arg Glu Glu Met Arg Val
690 695 700

Glu Ala Glu Ala Arg Glu Met Arg Glu Ala Leu Asp Arg Val Asp Ser
705 710 715 720

Val Ser Gln Arg Ala Gly Pro Leu Thr Val Met Pro Val Pro Ala Ala
725 730 735

Pro Gly Ala Gly Gly Arg Ala Pro Cys Pro Pro Ala Leu Gly Pro Glu
740 745 750

Ala Ile Gln Ala Arg Leu Glu Asp Val Arg Ile Gln Ala Arg Arg Ala
755 760 765

Ile Glu Ser Ala Ile Lys Glu Tyr Phe His Arg Gly Ala Val Tyr Ser
770 775 780

Ala Lys Ala Leu Gln Ala Ser Asp Ser His Asp Cys Arg Phe His Val
785 790 795 800

Ala Ser Ala Ala Val Val Pro Met Val Gln Leu Leu Glu Ser Leu Pro
805 810 815

Ala Phe Asp Gln His Thr Arg Asp Val Ala Gln Arg Ala Ala Leu Pro
820 825 830

Pro Pro Pro Pro Leu Ala Thr Ser Pro Gln Ala Ile Leu Leu Arg Asp
835 840 845

Leu Leu Gln Arg Gly Gln Thr Leu Asp Ala Pro Glu Asp Leu Ala Ala
850 855 860

Trp Leu Ser Val Leu Thr Asp Ala Ala Thr Gln Gly Leu Ile Glu Arg
865 870 875 880

Lys Pro Leu Glu Glu Leu Ala Arg Ser Ile His Gly Ile Asn Asp Gln
885 890 895

Gln Ala Arg Arg Ser Ser Gly Leu Ala Glu Leu Gln Arg Phe Asp Ala
900 905 910

Leu Asp Ala Ala Gln Gln Leu Asp Ser Asp Ala Ala Phe Val Pro Ala
915 920 925

Thr Gly Pro Ala Pro Tyr Val Asp Gly Gly Gly Leu Ser Pro Glu Ala
930 935 940

Thr Arg Met Ala Glu Asp Ala Leu Arg Gln Ala Arg Ala Met Glu Ala
945 950 955 960

Ala Lys Met Thr Ala Glu Leu Ala Pro Glu Ala Arg Ser Arg Leu Arg
965 970 975

Glu Arg Ala His Ala Leu Glu Ala Met Leu Asn Asp Ala Arg Glu Arg
980 985 990

Ala Lys Val Ala His Asp Ala Arg Glu Lys Phe Leu His Lys Leu Gln
995 1000 1005

Gly Val Leu Arg Pro Leu Pro Asp Phe Val Gly Leu Lys Ala Cys Pro
1010 1015 1020

Ala Val Leu Ala Thr Leu Arg Ala Ser Leu Pro Ala Gly Trp Thr Asp
1025 1030 1035 104

Leu Ala Asp Ala Val Arg Gly Pro Pro Pro Glu Val Thr Ala Ala Leu
1045 1050 1055

Arg Ala Asp Leu Trp Gly Leu Leu Gly Gln Tyr Arg Glu Ala Leu Glu
1060 1065 1070

His Pro Thr Pro Asp Thr Ala Thr Ala Gly Leu His Pro Ala Phe Val
1075 1080 1085

Val Val Leu Lys Thr Leu Phe Ala Asp Ala Pro Glu Thr Pro Val Leu
1090 1095 1100

Val Gln Phe Phe Ser Asp His Ala Pro Thr Ile Ala Lys Ala Val Ser
1105 1110 1115 112

Asn Ala Ile Asn Ala Gly Ser Ala Ala Val Ala Thr Asp Ala Ala Thr
1125 1130 1135

Val Asp Ala Ala Val Arg Ala His Gly Ala Asp Ala Val Ser Ala Leu
1140 1145 1150

Gly Ala Ala Ala Arg Asp Pro Asp Leu Ser Phe Leu Ala Ala Asp Ser
1155 1160 1165

Ala Ala Gly Tyr Val Lys Ala Thr Arg Leu Ala Leu Glu Arg Ala Ile
1170 1175 1180

Asp Glu Leu Thr Thr Leu Gly Ser Ala Ala Ala Asp Leu Val Val Gln
1185 1190 1195 120

Ala Arg Arg Ala Cys Ala Gln Pro Glu Gly Asp His Ala Ala Leu Ile
1205 1210 1215

Asp Ala Ala Ala Arg Ala Thr Thr Ala Ala Arg Glu Ser Leu Ala Gly
1220 1225 1230

His Glu Ala Gly Phe Gly Gly Leu Leu His Ala Glu Gly Thr Ala Gly
1235 1240 1245

Asp His Ser Pro Ser Gly Arg Ala Leu Gln Glu Leu Gly Lys Val Ile
1250 1255 1260

Gly Ala Thr Arg Arg Arg Ala Asp Glu Leu Glu Ala Ala Val Ala Asp
1265 1270 1275 128

Leu Thr Ala Lys Met Ala Ala Gln Arg Arg Ser Ser Trp Ala Ala Gly
1285 1290 1295

Val Glu Ala Ala Leu Asp Arg Val Glu Asn Arg Ala Glu Phe Asp Val
1300 1305 1310

Val Glu Leu Arg Arg Leu Gln Ala Gly Thr His Gly Tyr Asn Pro Arg
1315 1320 1325

Asp Phe Arg Lys Arg Ala Glu Gln Ala Ala Asn Ala Glu Ala Val Thr
1330 1335 1340

Leu Ala Leu Asp Thr Ala Phe Ala Phe Asn Pro Tyr Thr Pro Glu Asn
1345 1350 1355 136

Gln Arg His Pro Met Leu Pro Pro Leu Ala Ala Ile His Arg Leu Gly
1365 1370 1375

Trp Ser Ala Ala Phe His Ala Ala Ala Glu Thr Tyr Ala Asp Met Phe
1380 1385 1390

Arg Val Asp Ala Glu Pro Leu Ala Arg Leu Leu Arg Ile Ala Glu Gly
1395 1400 1405

Leu Leu Glu Met Ala Gln Ala Gly Asp Gly Phe Ile Asp Tyr His Glu
1410 1415 1420

Ala Val Gly Arg Leu Ala Asp Asp Met Thr Ser Val Pro Gly Leu Arg
1425 1430 1435 144

Arg Tyr Val Pro Phe Phe Gln His Gly Tyr Ala Asp Tyr Val Glu Leu
1445 1450 1455

Arg Asp Arg Leu Asp Ala Ile Arg Ala Asp Val His Arg Ala Leu Gly
1460 1465 1470

Gly Val Pro Leu Asp Leu Ala Ala Ala Ala Glu Gln Ile Ser Ala Ala
1475 1480 1485

Arg Asn Asp Pro Glu Ala Thr Ala Glu Leu Val Arg Thr Gly Val Thr
1490 1495 1500

Leu Pro Cys Pro Ser Glu Asp Ala Leu Val Ala Cys Ala Ala Ala Leu
1505 1510 1515 152

Glu Arg Val Asp Gln Ser Pro Val Lys Asn Thr Ala Tyr Ala Glu Tyr
1525 1530 1535

Val Ala Phe Val Thr Arg Gln Asp Thr Ala Glu Thr Lys Asp Ala Val
1540 1545 1550

Val Arg Ala Lys Gln Gln Arg Ala Glu Ala Thr Glu Arg Val Met Ala
1555 1560 1565

Gly Leu Arg Glu Ala Ala Arg Glu Arg Arg Ala Gln Ile Glu Ala Glu
1570 1575 1580

Gly Leu Ala Asn Leu Lys Thr Met Leu Lys Val Val Ala Val Pro Ala
1585 1590 1595 160

Thr Val Ala Lys Thr Leu Asp Gln Ala Arg Ser Val Ala Glu Ile Ala
1605 1610 1615

Asp Gln Val Glu Val Leu Leu Asp Gln Thr Glu Lys Thr Arg Glu Leu
1620 1625 1630

Asp Val Pro Ala Val Ile Trp Leu Glu His Ala Gln Arg Thr Phe Glu
1635 1640 1645

Thr His Pro Leu Ser Ala Arg Asp Gly Pro Gly Pro Leu Ala Arg His
1650 1655 1660

Ala Gly Arg Leu Gly Ala Leu Phe Asp Thr Arg Arg Arg Val Asp Ala
1665 1670 1675 168

Leu Arg Arg Ser Leu Glu Glu Ala Glu Ala Glu Trp Asp Glu Val Trp
1685 1690 1695

Gly Arg Phe Gly Arg Val Arg Gly Gly Ala Trp Lys Ser Pro Glu Gly
1700 1705 1710

Phe Arg Ala Met His Glu Gln Leu Arg Ala Leu Gln Asp Thr Thr Asn
1715 1720 1725

Thr Val Ser Gly Leu Arg Ala Gln Pro Ala Tyr Glu Arg Leu Ser Ala
1730 1735 1740

Arg Tyr Gln Gly Val Leu Gly Ala Lys Gly Ala Glu Arg Ala Glu Ala
1745 1750 1755 176

Val Glu Glu Leu Gly Ala Arg Val Thr Lys His Thr Ala Leu Cys Ala
1765 1770 1775

Arg Leu Arg Asp Glu Val Val Arg Arg Val Pro Trp Glu Met Asn Phe
1780 1785 1790

Asp Ala Leu Gly Arg Leu Leu Ala Glu Phe Asp Ala Ala Ala Ala Asp
1795 1800 1805

Leu Ala Pro Trp Ala Val Glu Glu Phe Arg Gly Ala Arg Glu Leu Ile
1810 1815 1820

Gln Arg Arg Met Gly Ser Ala Tyr Ala Arg Ala Gly Gly Gln Thr Gly
1825 1830 1835 184

Ala Gly Ala Ala Ala Ala Pro Ala Pro Leu Leu Val Asp Leu Arg Ala
1845 1850 1855

Leu Asp Ala Arg Ala Arg Ala Ser Ser Ser Pro Glu Gly His Glu Val
1860 1865 1870

Asp Pro Gln Leu Leu Arg Arg Arg Gly Glu Ala Tyr Leu Arg Ala Gly
1875 1880 1885

Gly Asp Pro Gly Pro Leu Val Leu Arg Glu Ala Val Ser Ala Leu Asp
1890 1895 1900

Leu Pro Phe Ala Thr Ser Phe Leu Ala Pro Asp Gly Thr Pro Leu Gln
1905 1910 1915 192

Tyr Ala Leu Cys Phe Pro Ala Val Thr Asp Lys Leu Gly Ala Leu Leu
1925 1930 1935

Met Arg Pro Glu Ala Ala Cys Val Arg Pro Pro Leu Pro Thr Asp Val
1940 1945 1950

Leu Glu Ser Ala Pro Thr Val Thr Ala Met Tyr Val Leu Thr Val Val
1955 1960 1965

Asn Arg Leu Gln Leu Ala Leu Ser Asp Ala Gln Ala Ala Asn Phe Gln
1970 1975 1980

Leu Phe Gly Arg Phe Val Arg His Arg Gln Ala Thr Trp Gly Ala Ser
1985 1990 1995 200

Met Asp Ala Ala Ala Glu Leu Tyr Val Val Ala Thr Thr Leu Thr Arg
2005 2010 2015

Glu Phe Gly Cys Arg Trp Ala Gln Leu Gly Trp Ala Ser Gly Ala Ala
2020 2025 2030

Ala Pro Arg Pro Pro Pro Gly Pro Arg Gly Ser Gln Arg His Cys Val
2035 2040 2045

Ala Phe Asn Glu Asn Asp Val Leu Val Val Ala Gly Val Pro Glu His
2050 2055 2060

Ile Tyr Asn Phe Trp Arg Leu Asp Leu Val Arg Gln His Glu Tyr Met
2065 2070 2075 208

His Leu Thr Leu Glu Arg Ala Phe Glu Asp Ala Ala Glu Ser Met Leu
2085 2090 2095

Phe Val Gln Arg Leu Thr Pro His Pro Asp Ala Arg Ile Arg Val Leu
2100 2105 2110

Pro Thr Phe Leu Asp Gly Gly Pro Pro Thr Arg Gly Leu Leu Phe Gly
2115 2120 2125

Thr Arg Leu Ala Asp Trp Arg Arg Gly Lys Leu Ser Glu Thr Asp Pro
2130 2135 2140

Leu Ala Pro Trp Arg Ser Ala Leu Glu Leu Gly Thr Gln Arg Arg Asp
2145 2150 2155 216

Ala Pro Ala Leu Gly Lys Leu Ser Pro Ala Gln Ala Ala Val Ser Val
2165 2170 2175

Leu Gly Arg Met Cys Leu Pro Ser Ala Ala Ala Leu Trp Thr Cys Met
2180 2185 2190

Phe Pro Asp Asp Tyr Thr Glu Tyr Asp Ser Phe Asp Ala Leu Leu Ala
2195 2200 2205

Ala Arg Leu Glu Ser Gly Gln Thr Leu Gly Pro Ala Gly Gly Arg Glu
2210 2215 2220

Ala Ser Leu Pro Glu Ala Pro His Ala Leu Tyr Arg Pro Thr Gly Gln
2225 2230 2235 224

His Val Ala Val Leu Ala Ala Ala Thr Thr Pro Ala Ala Arg Val Thr
2245 2250 2255

Ala Met Asp Leu Val Leu Ala Ala Val Leu Leu Gly Ala Pro Val Val
2260 2265 2270

Val Arg Asn Thr Thr Ala Phe Ser Arg Glu Ser Glu Leu Glu Leu Cys
2275 2280 2285

Leu Thr Leu Phe Asp Ser Arg Pro Gly Gly Pro Asp Ala Ala Leu Arg
2290 2295 2300

Asp Val Val Ser Ser Asp Ile Glu Thr Trp Ala Val Gly Leu Leu His
2305 2310 2315 232

Thr Asp Leu Asn Pro Ile Glu Asn Ala Cys Leu Ala Ala Gln Leu Pro
2325 2330 2335

Arg Leu Ser Ala Leu Ile Ala Glu Arg Pro Leu Ala Asp Gly Pro Pro
2340 2345 2350

Cys Leu Val Leu Val Asp Ile Ser Met Thr Pro Val Ala Val Leu Trp
2355 2360 2365

Glu Ala Pro Glu Pro Pro Gly Pro Pro Asp Val Arg Phe Val Gly Ser
2370 2375 2380

Glu Ala Thr Glu Glu Leu Pro Phe Val Ala Thr Ala Gly Asp Val Leu
2385 2390 2395 240

Ala Ala Ser Ala Ala Asp Ala Asp Pro Phe Phe Ala Arg Ala Ile Leu
2405 2410 2415

Gly Arg Pro Phe Asp Ala Ser Leu Leu Thr Gly Glu Leu Phe Pro Gly
2420 2425 2430

His Pro Val Tyr Gln Arg Pro Leu Ala Asp Glu Ala Gly Pro Ser Ala
2435 2440 2445

Pro Thr Ala Ala Arg Asp Pro Arg Asp Leu Ala Gly Gly Asp Gly Gly
2450 2455 2460

Ser Gly Pro Glu Asp Pro Ala Ala Pro Pro Ala Arg Gln Ala Asp Pro
2465 2470 2475 248

Gly Val Leu Ala Pro Thr Leu Leu Thr Asp Ala Thr Thr Gly Glu Pro
2485 2490 2495

Val Pro Pro Arg Met Trp Ala Trp Ile His Gly Leu Glu Glu Leu Ala
2500 2505 2510

Ser Asp Asp Ala Gly Gly Pro Thr Pro Asn Pro Ala Pro Ala Leu Leu
2515 2520 2525

Pro Pro Pro Ala Thr Asp Gln Ser Val Pro Thr Ser Gln Tyr Ala Pro
2530 2535 2540

Arg Pro Ile Gly Pro Ala Ala Thr Ala Arg Glu Trp Ser Val Pro Pro
2545 2550 2555 256

Gln Gln Asn Thr Gly Arg Val Pro Val Ala Pro Arg Asp Asp Pro Arg
2565 2570 2575

Pro Ser Pro Pro Thr Pro Ser Pro Pro Ala Asp Ala Ala Leu Pro Pro
2580 2585 2590

Pro Ala Phe Ser Gly Ser Ala Ala Ala Phe Ser Ala Ala Val Pro Arg
2595 2600 2605

Val Arg Arg Ser Arg Arg Thr Arg Ala Lys Ser Arg Ala Pro Arg Ala
2610 2615 2620

Ser Ala Pro Pro Glu Gly Trp Arg Pro Pro Ala Leu Pro Ala Pro Val
2625 2630 2635 264

Ala Pro Val Ala Ala Ser Ala Arg Pro Pro Asp Gln Pro Pro Thr Pro
2645 2650 2655

Glu Ser Ala Pro Pro Ala Trp Val Ser Ala Leu Pro Leu Pro Pro Gly
2660 2665 2670

Pro Ala Ser Arg Ala Phe Pro Ala Pro Thr Leu Ala Pro Ile Pro Pro
2675 2680 2685

Pro Pro Ala Glu Gly Ala Val Ala Pro Gly Asp Asp Arg Arg Arg Gly
2690 2695 2700

Arg Arg Gln Thr Thr Ala Gly Pro Ser Pro Thr Pro Pro Arg Gly Pro
2705 2710 2715 272

Ala Ala Gly Pro Pro Arg Arg Leu Trp Ala Val Ala Ser Leu Ser Ala
2725 2730 2735

Ser Leu Asn Ser Leu Pro Ser Pro Arg Asp Pro Ala Asp His Ala Ala
2740 2745 2750

Ala Val Ser Ala Ala Ala Ala Ala Val Pro Pro Ser Pro Gly Leu Ala
2755 2760 2765

Pro Pro Thr Ser Ala Val Gln Thr Ser Pro Pro Pro Leu Ala Pro Gly
2770 2775 2780

Pro Val Ala Pro Ser Glu Pro Leu Cys Gly Trp Val Val Pro Gly Gly
2785 2790 2795 280

Pro Val Ala Arg Arg Pro Pro Pro Gln Ser Pro Ala Thr Lys Pro Ala
2805 2810 2815

Ala Arg Thr Arg Ile Arg Ala Arg Ser Val Pro Gln Pro Pro Leu Pro
2820 2825 2830

Gln Pro Pro Leu Pro Gln Pro Pro Leu Pro Gln Pro Pro Leu Pro Gln
2835 2840 2845

Pro Pro Leu Pro Gln Pro Pro Leu Pro Gln Pro Pro Leu Pro Gln Pro
2850 2855 2860

Pro Leu Pro Gln Pro Pro Leu Pro Gln Pro Pro Leu Pro Gln Pro Pro
2865 2870 2875 288

Leu Pro Gln Pro Pro Leu Pro Gln Ser Arg Asp Ser Val Pro Thr Pro
2885 2890 2895

Glu Ser Pro Thr His Thr Asn Thr His Leu Pro Val Ser Ala Val Thr
2900 2905 2910

Ser Trp Ala Ser Ser Leu Ala Leu His Val Asp Ser Ala Pro Pro Pro
2915 2920 2925

Ala Ser Leu Leu Gln Thr Leu His Ser Asp Asp Glu His Ser Asp Ala
2930 2935 2940

Asp Ser Leu Arg Phe Ser Asp Ser Asp Asp Thr Glu Ala Leu Asp Pro
2945 2950 2955 296

Leu Pro Pro Glu Pro His Leu Pro Pro Ala Asp Glu Pro Pro Gly Pro
2965 2970 2975

Leu Ala Ala Asp His Leu Gln Ser Pro His Ser Gln Phe Gly Pro Leu
2980 2985 2990

Pro Val Gln Ala Asn Ala Val Leu Ser Arg Arg Tyr Val Arg Ser Thr
2995 3000 3005

Gly Arg Ser Ala Val Leu Ile Arg Ala Cys Arg Arg Ile Gln Gln Gln
3010 3015 3020

Leu Gln Arg Thr Arg Arg Ala Leu Phe Gln Arg Ser Asn Ala Val Leu
3025 3030 3035 304

Thr Ser Leu His His Val Arg Met Leu Leu Gly
3045 3050






(2) INFORMATION FOR SEQ ID NO: 253: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1124 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253:



Met Ala Asn Arg Pro Ala Ala Ser Ala Gly Ala Arg Ser Pro Ser Gln
1 5 10 15

Glu Pro Arg Glu Pro Glu Val Ala Pro Pro Gly Gly Asp His Val Phe
20 25 30

Cys Arg Lys Val Ser Gly Val Met Val Leu Ser Ser Asp Pro Pro Gly
35 40 45

Pro Ala Ala Tyr Arg Ile Ser Asp Ser Ser Phe Val Gln Cys Gly Ser
50 55 60

Asn Cys Ser Met Ile Ile Asp Gly Asp Val Arg His Leu Arg Asp Leu
65 70 75 80

Glu Gly Ala Thr Ser Thr Gly Ala Phe Val Ala Ile Ser Asn Val Ala
85 90 95

Ala Gly Gly Asp Gly Arg Thr Ala Val Val Gly Gly Thr Ser Gly Pro
100 105 110

Ser Ala Thr Thr Ser Val Gly Thr Gln Thr Ser Gly Glu Phe Leu His
115 120 125

Gly Asn Pro Arg Thr Pro Glu Pro Gln Gly Pro Gln Ala Val Pro Pro
130 135 140

Pro Pro Pro Pro Pro Phe Pro Trp Gly His Glu Cys Cys Ala Arg Arg
145 150 155 160

Asp Arg Gly Ala Glu Lys Asp Val Gly Ala Ala Glu Ser Trp Ser Asp
165 170 175

Gly Pro Ser Ser Asp Ser Glu Thr Glu Asp Ser Asp Ser Ser Asp Glu
180 185 190

Asp Thr Gly Ser Gly Ser Glu Thr Leu Ser Arg Ser Ser Ser Ile Trp
195 200 205

Ala Ala Gly Ala Thr Asp Asp Asp Asp Ser Asp Ser Asp Ser Arg Ser
210 215 220

Asp Asp Ser Val Gln Pro Asp Val Val Val Arg Arg Arg Trp Ser Asp
225 230 235 240

Gly Pro Ala Pro Val Ala Phe Pro Lys Pro Arg Arg Pro Gly Asp Ser
245 250 255

Pro Gly Asn Pro Gly Leu Gly Ala Gly Thr Gly Pro Gly Ser Ala Thr
260 265 270

Asp Pro Arg Ala Ser Ala Asp Ser Asp Ser Ala Ala His Ala Ala Ala
275 280 285

Pro Gln Ala Asp Val Ala Pro Val Leu Asp Ser Gln Pro Thr Val Gly
290 295 300

Thr Asp Pro Gly Tyr Pro Val Pro Leu Glu Leu Thr Pro Glu Asn Ala
305 310 315 320

Glu Ala Val Ala Arg Phe Leu Gly Asp Ala Val Asp Arg Glu Pro Ala
325 330 335

Leu Met Leu Glu Tyr Phe Cys Arg Cys Ala Arg Glu Glu Ser Lys Arg
340 345 350

Val Pro Pro Arg Thr Phe Gly Ser Ala Pro Arg Leu Thr Glu Asp Asp
355 360 365

Phe Gly Leu Leu Asn Tyr Ala Glu Met Arg Arg Leu Cys Leu Asp Leu
370 375 380

Pro Pro Val Pro Pro Asn Ala Tyr Thr Pro Tyr His Leu Arg Glu Tyr
385 390 395 400

Ala Thr Arg Leu Val Asn Gly Phe Lys Pro Leu Val Arg Arg Ser Ala
405 410 415

Arg Leu Tyr Arg Ile Leu Gly Ile Leu Val His Leu Arg Ile Arg Thr
420 425 430

Arg Glu Ala Ser Phe Glu Glu Trp Met Arg Ser Lys Glu Val Asp Leu
435 440 445

Asp Phe Gly Leu Thr Glu Arg Leu Arg Glu His Glu Ala Gln Leu Met
450 455 460

Ile Leu Ala Gln Ala Leu Asn Pro Tyr Asp Cys Leu Ile His Ser Thr
465 470 475 480

Pro Asn Thr Leu Val Glu Arg Gly Leu Gln Ser Ala Leu Lys Tyr Glu
485 490 495

Glu Phe Tyr Leu Lys Arg Phe Gly Gly His Tyr Met Glu Ser Val Phe
500 505 510

Gln Met Tyr Thr Arg Ile Ala Gly Phe Leu Ala Cys Arg Ala Thr Arg
515 520 525

Gly Met Arg His Ile Ala Leu Gly Arg Gln Gly Ser Trp Trp Glu Met
530 535 540

Phe Lys Phe Phe Phe His Arg Leu Tyr Asp His Gln Ile Val Pro Ser
545 550 555 560

Thr Pro Ala Met Leu Asn Leu Gly Thr Arg Asn Tyr Tyr Thr Ser Ser
565 570 575

Cys Tyr Leu Val Asn Pro Gln Ala Thr Thr Asn Gln Ala Thr Leu Arg
580 585 590

Ala Ile Thr Gly Asn Val Ser Ala Ile Leu Ala Arg Asn Gly Gly Ile
595 600 605

Gly Leu Cys Met Gln Ala Phe Asn Asp Asp Gly Thr Ala Ser Ile Met
610 615 620

Pro Ala Leu Lys Val Leu Asp Ser Leu Val Ala Ala His Asn Lys Gln
625 630 635 640

Ser Trp Thr Gly Ala Cys Val Tyr Leu Glu Pro Trp His Ser Asp Val
645 650 655

Arg Ala Val Leu Arg Met Lys Gly Val Leu Ala Gly Glu Glu Ala Gln
660 665 670

Arg Cys Asp Asn Ile Phe Ser Ala Leu Trp Met Pro Asp Leu Phe Phe
675 680 685

Lys Arg Leu Ile Arg His Leu Asp Gly Glu Lys Asn Val Thr Trp Ser
690 695 700

Leu Phe Asp Arg Asp Thr Ser Met Ser Leu Ala Asp Phe His Gly Glu
705 710 715 720

Glu Phe Glu Lys Leu Tyr Glu His Leu Glu Ala Met Gly Phe Gly Glu
725 730 735

Thr Ile Pro Ile Gln Asp Leu Ala Tyr Ala Ile Val Arg Ser Ala Ala
740 745 750

Thr Thr Gly Ser Pro Phe Ile Met Phe Lys Asp Ala Val Asn Arg His
755 760 765

Tyr Ile Tyr Asp Thr Gln Gly Ala Ala Ile Ala Gly Ser Asn Leu Cys
770 775 780

Thr Glu Ile Val His Pro Ser Ser Lys Arg Ser Ser Gly Val Cys Asn
785 790 795 800

Leu Gly Ser Val Asn Leu Ala Arg Cys Val Ser Arg Arg Thr Phe Asp
805 810 815

Phe Gly Met Leu Arg Asp Ala Val Gln Ala Cys Val Leu Met Val Asn
820 825 830

Ile Met Ile Asp Ser Thr Leu Gln Pro Thr Pro Gln Cys Arg His Asp
835 840 845

Asn Leu Arg Ser Met Gly Ile Gly Met Gln Gly Leu His Thr Ala Cys
850 855 860

Leu Lys Met Gly Leu Asp Leu Glu Ser Ala Glu Phe Arg Asp Leu Asn
865 870 875 880

Thr His Ile Ala Glu Val Met Leu Leu Ala Ala Met Lys Thr Ser Asn
885 890 895

Ala Leu Cys Val Arg Gly Ala Arg Pro Phe Ser His Phe Lys Arg Ser
900 905 910

Met Tyr Arg Ala Gly Arg Phe His Trp Glu Arg Phe Ser Asn Asp Arg
915 920 925

Tyr Glu Gly Glu Trp Glu Met Leu Arg Gln Ser Met Met Lys His Gly
930 935 940

Leu Arg Asn Ser Gln Phe Ile Ala Leu Met Pro Thr Ala Ala Ser Ala
945 950 955 960

Gln Ile Ser Asp Val Ser Glu Gly Phe Ala Pro Leu Phe Thr Asn Leu
965 970 975

Phe Ser Lys Val Thr Arg Asp Gly Glu Thr Leu Arg Pro Asn Thr Leu
980 985 990

Leu Leu Lys Glu Leu Glu Arg Thr Phe Gly Gly Lys Arg Leu Leu Asp
995 1000 1005

Ala Met Asp Gly Leu Glu Ala Lys Gln Trp Ser Val Ala Gln Ala Leu
1010 1015 1020

Pro Cys Leu Asp Pro Ala His Pro Leu Arg Arg Phe Lys Thr Ala Phe
1025 1030 1035 104

Asp Tyr Asp Gln Glu Leu Leu Ile Asp Leu Cys Ala Asp Arg Ala Pro
1045 1050 1055

Tyr Val Asp His Ser Gln Ser Met Thr Leu Tyr Val Thr Glu Lys Ala
1060 1065 1070

Asp Gly Thr Leu Pro Ala Ser Thr Leu Val Arg Leu Leu Val His Ala
1075 1080 1085

Tyr Lys Arg Gly Leu Lys Thr Gly Met Tyr Tyr Cys Lys Val Arg Lys
1090 1095 1100

Ala Thr Asn Ser Gly Val Phe Ala Gly Asp Asp Asn Ile Val Cys Thr
1105 1110 1115 112

Ser Cys Ala Leu



(2) INFORMATION FOR SEQ ID NO: 254:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1124 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:

Met Ala Asn Arg Pro Ala Ala Ser Ala Gly Ala Arg Ser Pro Ser Gln
1 5 10 15

Glu Pro Arg Glu Pro Glu Val Ala Pro Pro Gly Gly Asp His Val Phe
20 25 30

Cys Arg Lys Val Ser Gly Val Met Val Leu Ser Ser Asp Pro Pro Gly
35 40 45

Pro Ala Ala Tyr Arg Ile Ser Asp Ser Ser Phe Val Gln Cys Gly Ser
50 55 60

Asn Cys Ser Met Ile Ile Asp Gly Asp Val Arg His Leu Arg Asp Leu
65 70 75 80

Glu Gly Ala Thr Ser Thr Gly Ala Phe Val Ala Ile Ser Asn Val Ala
85 90 95

Ala Gly Gly Asp Gly Arg Thr Ala Val Val Gly Gly Thr Ser Gly Pro
100 105 110

Ser Ala Thr Thr Ser Val Gly Thr Gln Thr Ser Gly Glu Phe Leu His
115 120 125

Gly Asn Pro Arg Thr Pro Glu Pro Gln Gly Pro Gln Ala Val Pro Pro
130 135 140

Pro Pro Pro Pro Pro Phe Pro Trp Gly His Glu Cys Cys Ala Arg Arg
145 150 155 160

Asp Arg Gly Ala Glu Lys Asp Val Gly Ala Ala Glu Ser Trp Ser Asp
165 170 175

Gly Pro Ser Ser Asp Ser Glu Thr Glu Asp Ser Asp Ser Ser Asp Glu
180 185 190

Asp Thr Gly Ser Gly Ser Glu Thr Leu Ser Arg Ser Ser Ser Ile Trp
195 200 205

Ala Ala Gly Ala Thr Asp Asp Asp Asp Ser Asp Ser Asp Ser Arg Ser
210 215 220

Asp Asp Ser Val Gln Pro Asp Val Val Val Arg Arg Arg Trp Ser Asp
225 230 235 240

Gly Pro Ala Pro Val Ala Phe Pro Lys Pro Arg Arg Pro Gly Asp Ser
245 250 255

Pro Gly Asn Pro Gly Leu Gly Ala Gly Thr Gly Pro Gly Ser Ala Thr
260 265 270

Asp Pro Arg Ala Ser Ala Asp Ser Asp Ser Ala Ala His Ala Ala Ala
275 280 285

Pro Gln Ala Asp Val Ala Pro Val Leu Asp Ser Gln Pro Thr Val Gly
290 295 300

Thr Asp Pro Gly Tyr Pro Val Pro Leu Glu Leu Thr Pro Glu Asn Ala
305 310 315 320

Glu Ala Val Ala Arg Phe Leu Gly Asp Ala Val Asp Arg Glu Pro Ala
325 330 335

Leu Met Leu Glu Tyr Phe Cys Arg Cys Ala Arg Glu Glu Ser Lys Arg
340 345 350

Val Pro Pro Arg Thr Phe Gly Ser Ala Pro Arg Leu Thr Glu Asp Asp
355 360 365

Phe Gly Leu Leu Asn Tyr Ala Glu Met Arg Arg Leu Cys Leu Asp Leu
370 375 380

Pro Pro Val Pro Pro Asn Ala Tyr Thr Pro Tyr His Leu Arg Glu Tyr
385 390 395 400

Ala Thr Arg Leu Val Asn Gly Phe Lys Pro Leu Val Arg Arg Ser Ala
405 410 415

Arg Leu Tyr Arg Ile Leu Gly Ile Leu Val His Leu Arg Ile Arg Thr
420 425 430

Arg Glu Ala Ser Phe Glu Glu Trp Met Arg Ser Lys Glu Val Asp Leu
435 440 445

Asp Phe Gly Leu Thr Glu Arg Leu Arg Glu His Glu Ala Gln Leu Met
450 455 460

Ile Leu Ala Gln Ala Leu Asn Pro Tyr Asp Cys Leu Ile His Ser Thr
465 470 475 480

Pro Asn Thr Leu Val Glu Arg Gly Leu Gln Ser Ala Leu Lys Tyr Glu
485 490 495

Glu Phe Tyr Leu Lys Arg Phe Gly Gly His Tyr Met Glu Ser Val Phe
500 505 510

Gln Met Tyr Thr Arg Ile Ala Gly Phe Leu Ala Cys Arg Ala Thr Arg
515 520 525

Gly Met Arg His Ile Ala Leu Gly Arg Gln Gly Ser Trp Trp Glu Met
530 535 540

Phe Lys Phe Phe Phe His Arg Leu Tyr Asp His Gln Ile Val Pro Ser
545 550 555 560

Thr Pro Ala Met Leu Asn Leu Gly Thr Arg Asn Tyr Tyr Thr Ser Ser
565 570 575

Cys Tyr Leu Val Asn Pro Gln Ala Thr Thr Asn Gln Ala Thr Leu Arg
580 585 590

Ala Ile Thr Gly Asn Val Ser Ala Ile Leu Ala Arg Asn Gly Gly Ile
595 600 605

Gly Leu Cys Met Gln Ala Phe Asn Asp Asp Gly Thr Ala Ser Ile Met
610 615 620

Pro Ala Leu Lys Val Leu Asp Ser Leu Val Ala Ala His Asn Lys Gln
625 630 635 640

Ser Trp Thr Gly Ala Cys Val Tyr Leu Glu Pro Trp His Ser Asp Val
645 650 655

Arg Ala Val Leu Arg Met Lys Gly Val Leu Ala Gly Glu Glu Ala Gln
660 665 670

Arg Cys Asp Asn Ile Phe Ser Ala Leu Trp Met Pro Asp Leu Phe Phe
675 680 685

Lys Arg Leu Ile Arg His Leu Asp Gly Glu Lys Asn Val Thr Trp Ser
690 695 700

Leu Phe Asp Arg Asp Thr Ser Met Ser Leu Ala Asp Phe His Gly Glu
705 710 715 720

Glu Phe Glu Lys Leu Tyr Glu His Leu Glu Ala Met Gly Phe Gly Glu
725 730 735

Thr Ile Pro Ile Gln Asp Leu Ala Tyr Ala Ile Val Arg Ser Ala Ala
740 745 750

Thr Thr Gly Ser Pro Phe Ile Met Phe Lys Asp Ala Val Asn Arg His
755 760 765

Tyr Ile Tyr Asp Thr Gln Gly Ala Ala Ile Ala Gly Ser Asn Leu Cys
770 775 780

Thr Glu Ile Val His Pro Ser Ser Lys Arg Ser Ser Gly Val Cys Asn
785 790 795 800

Leu Gly Ser Val Asn Leu Ala Arg Cys Val Ser Arg Arg Thr Phe Asp
805 810 815

Phe Gly Met Leu Arg Asp Ala Val Gln Ala Cys Val Leu Met Val Asn
820 825 830

Ile Met Ile Asp Ser Thr Leu Gln Pro Thr Pro Gln Cys Arg His Asp
835 840 845

Asn Leu Arg Ser Met Gly Ile Gly Met Gln Gly Leu His Thr Ala Cys
850 855 860

Leu Lys Met Gly Leu Asp Leu Glu Ser Ala Glu Phe Arg Asp Leu Asn
865 870 875 880

Thr His Ile Ala Glu Val Met Leu Leu Ala Ala Met Lys Thr Ser Asn
885 890 895

Ala Leu Cys Val Arg Gly Ala Arg Pro Phe Ser His Phe Lys Arg Ser
900 905 910

Met Tyr Arg Ala Gly Arg Phe His Trp Glu Arg Phe Ser Asn Asp Arg
915 920 925

Tyr Glu Gly Glu Trp Glu Met Leu Arg Gln Ser Met Met Lys His Gly
930 935 940

Leu Arg Asn Ser Gln Phe Ile Ala Leu Met Pro Thr Ala Ala Ser Ala
945 950 955 960

Gln Ile Ser Asp Val Ser Glu Gly Phe Ala Pro Leu Phe Thr Asn Leu
965 970 975

Phe Ser Lys Val Thr Arg Asp Gly Glu Thr Leu Arg Pro Asn Thr Leu
980 985 990

Leu Leu Lys Glu Leu Glu Arg Thr Phe Gly Gly Lys Arg Leu Leu Asp
995 1000 1005

Ala Met Asp Gly Leu Glu Ala Lys Gln Trp Ser Val Ala Gln Ala Leu
1010 1015 1020

Pro Cys Leu Asp Pro Ala His Pro Leu Arg Arg Phe Lys Thr Ala Phe
1025 1030 1035 104

Asp Tyr Asp Gln Glu Leu Leu Ile Asp Leu Cys Ala Asp Arg Ala Pro
1045 1050 1055

Tyr Val Asp His Ser Gln Ser Met Thr Leu Tyr Val Thr Glu Lys Ala
1060 1065 1070

Asp Gly Thr Leu Pro Ala Ser Thr Leu Val Arg Leu Leu Val His Ala
1075 1080 1085

Tyr Lys Arg Gly Leu Lys Thr Gly Met Tyr Tyr Cys Lys Val Arg Lys
1090 1095 1100

Ala Thr Asn Ser Gly Val Phe Ala Gly Asp Asp Asn Ile Val Cys Thr
1105 1110 1115 112

Ser Cys Ala Leu



(2) INFORMATION FOR SEQ ID NO: 255:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:



Met Asp Pro Ala Val Ser Pro Ala Ser Thr Asp Pro Leu Asp Thr His
1 5 10 15

Ala Ser Gly Ala Gly Ala Ala Pro Ile Pro Val Cys Pro Thr Pro Glu
20 25 30

Arg Tyr Phe Tyr Thr Ser Gln Cys Pro Asp Ile Asn His Leu Arg Ser
35 40 45

Leu Ser Ile Leu Asn Arg Trp Leu Glu Thr Glu Leu Val Phe Val Gly
50 55 60

Asp Glu Glu Asp Val Ser Lys Leu Ser Glu Gly Glu Leu Gly Phe Tyr
65 70 75 80

Arg Phe Leu Phe Ala Phe Leu Ser Ala Ala Asp Asp Leu Val Thr Glu
85 90 95

Asn Leu Gly Gly Leu Ser Gly Leu Phe Glu Gln Lys Asp Ile Leu His
100 105 110

Tyr Tyr Val Glu Gln Glu Cys Ile Glu Val Val His Ser Arg Val Tyr
115 120 125

Asn Ile Ile Gln Leu Val Leu Phe His Asn Asn Asp Gln Ala Arg Arg
130 135 140

Ala Tyr Val Ala Arg Thr Ile Asn His Pro Ala Ile Arg Val Lys Val
145 150 155 160

Asp Trp Leu Glu Ala Arg Val Arg Glu Cys Asp Ser Ile Pro Glu Lys
165 170 175

Phe Ile Leu Met Ile Leu Ile Glu Gly Val Phe Phe Ala Ala Ser Phe
180 185 190

Ala Ala Ile Ala Tyr Leu Arg Thr Asn Asn Leu Leu Arg Val Thr Cys
195 200 205

Gln Ser Asn Asp Leu Ile Ser Arg Asp Glu Ala Val His Thr Thr Ala
210 215 220

Ser Cys Tyr Ile Tyr Asn Asn Tyr Leu Gly Gly His Ala Lys Pro Glu
225 230 235 240

Ala Ala Arg Val Tyr Arg Leu Phe Arg Glu Ala Val Asp Ile Glu Ile
245 250 255

Gly Phe Ile Arg Ser Gln Ala Pro Thr Asp Ser Ser Ile Leu Ser Pro
260 265 270

Gly Ala Ala Ile Glu Asn Tyr Val Arg Phe Ser Ala Asp Arg Leu Leu
275 280 285

Gly Leu Ile His Met Gln Pro Lys Ala Pro Ala Pro Asp Ala Ser Phe
290 295 300

Pro Leu Ser Leu Met Ser Thr Asp Lys His Thr Asn Phe Phe Glu Cys
305 310 315 320

Arg Ser Thr Ser Tyr Ala Gly Ala Val Val Asn Asp Leu
325 330



(2) INFORMATION FOR SEQ ID NO: 256:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 357 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256:

Met Arg Arg Arg Gly His Ala Phe Ala Pro Gly Asp Arg Gly Thr Arg
1 5 10 15

Ala Ala Gly Pro Gly Pro Ala Ala Pro Trp Gly Ala Pro Ser Lys Pro
20 25 30

Ala Leu Arg Leu Ala His Leu Phe Cys Ile Arg Val Leu Arg Ala Leu
35 40 45

Gly Tyr Ala Tyr Ile Asn Ser Gly Gln Leu Glu Ala Asp Asp Ala Cys
50 55 60

Ala Asn Leu Tyr His Thr Asn Thr Val Ala Tyr Val His Thr Thr Asp
65 70 75 80

Thr Asp Leu Leu Leu Met Gly Cys Asp Ile Val Leu Asp Ile Ser Thr
85 90 95

Gly Tyr Ile Pro Thr Ile His Cys Arg Asp Leu Leu Gln Tyr Phe Lys
100 105 110

Met Ser Tyr Pro Gln Phe Leu Ala Leu Phe Val Arg Cys His Thr Asp
115 120 125

Leu His Pro Asn Asn Thr Tyr Ala Ser Val Glu Asp Val Leu Arg Glu
130 135 140

Cys His Trp Thr Ala Pro Ser Arg Ser Gln Ala Arg Arg Ala Ala Arg
145 150 155 160

Arg Glu Arg Ala Asn Ser Arg Ser Leu Glu Ser Met Pro Thr Leu Thr
165 170 175

Ala Ala Pro Val Gly Leu Glu Thr Arg Ile Ser Trp Thr Glu Ile Leu
180 185 190

Ala Gln Gln Ile Ala Gly Glu Asp Asp Tyr Glu Glu Asp Pro Pro Leu
195 200 205

Gln Pro Pro Asp Val Ala Gly Gly Pro Arg Asp Gly Ala Arg Ser Ser
210 215 220

Ser Ser Glu Ile Leu Thr Pro Pro Glu Leu Val Gln Val Pro Asn Ala
225 230 235 240

Gln Arg Val Ala Glu His Arg Gly Tyr Val Ala Gly Arg Arg Arg His
245 250 255

Val Ile His Asp Ala Pro Glu Ala Leu Asp Trp Leu Pro Asp Pro Met
260 265 270

Thr Ile Ala Glu Leu Val Glu His Arg Tyr Val Lys Tyr Val Ile Ser
275 280 285

Leu Ile Ser Pro Lys Glu Arg Gly Pro Trp Thr Leu Leu Lys Arg Leu
290 295 300

Pro Ile Tyr Gln Asp Leu Arg Asp Glu Asp Leu Ala Arg Ser Ile Val
305 310 315 320

Thr Arg His Ile Thr Ala Pro Asp Ile Ala Asp Arg Phe Leu Ala Gln
325 330 335

Leu Trp Ala His Ala Pro Pro Pro Ala Phe Tyr Lys Asp Val Leu Ala
340 345 350

Lys Phe Trp Asp Glu
355

(2) INFORMATION FOR SEQ ID NO: 257:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 466 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:

Met Ala His Leu Pro Gly Gly Ala Ala Ala Ala Pro Leu Ser Glu Asp
1 5 10 15

Ala Ile Pro Ser Pro Arg Glu Arg Thr Glu Asp Trp Pro Pro Cys Gln
20 25 30

Ile Val Leu Gln Gly Ala Glu Leu Asn Gly Ile Leu Gln Ala Phe Ala
35 40 45

Pro Leu Arg Thr Ser Leu Leu Asp Ser Leu Leu Val Val Gly Asp Arg
50 55 60

Gly Ile Leu Val His Asn Ala Ile Phe Gly Glu Gln Val Phe Leu Pro
65 70 75 80

Leu Asp His Ser Gln Phe Ser Arg Tyr Arg Trp Gly Gly Pro Thr Ala
85 90 95

Ala Phe Leu Ser Leu Val Asp Gln Lys Arg Ser Leu Leu Ser Val Phe
100 105 110

Arg Ala Asn Gln Tyr Pro Asp Leu Arg Arg Val Glu Leu Thr Val Thr
115 120 125

Gly Gln Ala Pro Phe Arg Thr Leu Val Gln Arg Ile Trp Thr Thr Ala
130 135 140

Ser Asp Gly Glu Ala Val Glu Leu Ala Ser Glu Thr Leu Met Lys Arg
145 150 155 160

Glu Leu Thr Ser Phe Ala Val Leu Leu Pro Gln Gly Asp Pro Asp Val
165 170 175

Gln Leu Arg Leu Thr Lys Pro Gln Leu Thr Lys Val Val Asn Ala Val
180 185 190

Gly Asp Glu Thr Ala Lys Pro Thr Thr Phe Glu Leu Gly Pro Asn Gly
195 200 205

Lys Phe Ser Val Phe Asn Ala Arg Thr Cys Val Thr Phe Ala Ala Arg
210 215 220

Glu Glu Gly Ala Ser Ser Ser Thr Ser Ala Gln Val Gln Ile Leu Thr
225 230 235 240

Ser Ala Leu Lys Lys Ala Gly Gln Ala Ala Ala Asn Ala Lys Thr Val
245 250 255

Tyr Gly Glu Asn Thr Thr Phe Ser Val Val Val Asp Asp Cys Ser Met
260 265 270

Arg Ala Val Leu Arg Arg Leu Gln Val Gly Gly Gly Thr Leu Lys Phe
275 280 285

Phe Leu Thr Ala Asp Val Pro Ser Val Cys Val Thr Ala Thr Gly Pro
290 295 300

Asn Ala Val Ser Ala Val Phe Leu Leu Lys Pro Gln Arg Val Cys Leu
305 310 315 320

Asn Trp Leu Gly Arg Thr Pro Gly Ser Ser Thr Gly Ser Leu Ala Ser
325 330 335

Gln Asp Ser Arg Ala Gly Pro Thr Asp Ser Gln Asp Phe Ser Ser Glu
340 345 350

Pro Asp Ala Gly Asp Arg Gly Ala Pro Glu Glu Glu Gly Leu Glu Gly
355 360 365

Gln Ala Arg Val Pro Pro Ala Phe Pro Glu Pro Pro Gly Thr Lys Arg
370 375 380

Arg His Ala Gly Ala Glu Val Val Pro Ala Asp Asp Ala Thr Lys Arg
385 390 395 400

Pro Lys Thr Gly Val Pro Ala Ala Pro Thr Arg Ala Glu Ser Pro Pro
405 410 415

Leu Ser Ala Arg Tyr Gly Pro Glu Ala Ala Glu Gly Gly Gly Asp Gly
420 425 430

Gly Arg Tyr Ala Cys Tyr Phe Arg Asp Leu Gln Thr Gly Asp Asp Ser
435 440 445

Pro Leu Ser Ala Phe Arg Gly Pro Gln Arg Pro Pro Tyr Gly Phe Gly
450 455 460

Leu Pro
465

(2) INFORMATION FOR SEQ ID NO: 258:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 170 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258:

Met Ala Phe Arg Ala Ser Gly Pro Ala Tyr Gln Pro Leu Ala Pro Ala
1 5 10 15

Asp Ala Arg Ala Arg Val Pro Ala Val Ala Trp Ile Gly Val Gly Ala
20 25 30

Ile Val Gly Ala Phe Ala Leu Val Ala Ala Leu Val Leu Val Pro Pro
35 40 45

Arg Ser Ser Trp Gly Leu Ser Pro Cys Asp Ser Gly Trp Gln Glu Phe
50 55 60

Asn Ala Gly Cys Val Ala Trp Asp Pro Thr Pro Val Glu His Glu Gln
65 70 75 80

Ala Val Gly Gly Cys Ser Ala Pro Ala Thr Leu Ile Pro Arg Ala Ala
85 90 95

Ala Lys His Leu Ala Ala Leu Thr Arg Val Gln Ala Glu Arg Ser Ser
100 105 110

Gly Tyr Trp Trp Val Asn Gly Asp Gly Ile Arg Thr Cys Leu Arg Leu
115 120 125

Val Asp Ser Val Ser Gly Ile Asp Glu Phe Cys Glu Glu Leu Ala Ile
130 135 140

Arg Ile Cys Tyr Tyr Pro Arg Ser Pro Gly Gly Phe Val Arg Phe Val
145 150 155 160

Thr Ser Ile Arg Asn Ala Leu Gly Leu Pro
165 170

(2) INFORMATION FOR SEQ ID NO: 259: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 713 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:259: 


Met Gln Arg Arg Arg Ala Ser Ser Leu Arg Leu Ala Arg Cys Leu Thr
1 5 10 15

Pro Ala Asn Leu Ile Arg Gly Ala Asn Ala Gly Val Pro Glu Arg Arg
20 25 30

Ile Phe Ala Gly Cys Leu Leu Pro Thr Pro Glu Gly Leu Leu Ser Ala
35 40 45

Ala Val Gly Val Leu Arg Gln Arg Ala Asp Asp Leu Gln Pro Ala Phe
50 55 60

Leu Thr Gly Ala Asp Arg Ser Val Arg Leu Ala Ala Arg His His Asn
65 70 75 80

Thr Val Pro Glu Ser Leu Ile Val Asp Gly Leu Ala Ser Asp Pro His
85 90 95

Tyr Asp Tyr Ile Arg His Tyr Ala Ser Ala Ala Lys Gln Ala Leu Gly
100 105 110

Glu Val Glu Leu Ser Gly Gly Gln Leu Ser Arg Ala Ile Leu Ala Gln
115 120 125

Tyr Trp Lys Tyr Leu Gln Thr Val Val Pro Ser Gly Leu Asp Ile Pro
130 135 140

Asp Asp Pro Ala Gly Asp Cys Asp Pro Ser Leu His Val Leu Leu Arg
145 150 155 160

Pro Thr Leu Leu Pro Lys Leu Leu Val Arg Ala Pro Phe Lys Ser Gly
165 170 175

Ala Ala Ala Ala Lys Tyr Ala Ala Ala Val Ala Gly Leu Arg Asp Ala
180 185 190

Ala His Arg Leu Gln Gln Tyr Met Phe Phe Met Arg Pro Ala Asp Pro
195 200 205

Ser Arg Pro Ser Thr Asp Thr Ala Leu Arg Leu Ser Glu Phe Leu Ala
210 215 220

Tyr Val Ser Val Leu Tyr His Trp Ala Ser Trp Met Leu Trp Thr Ala
225 230 235 240

Asp Lys Tyr Val Cys Arg Arg Leu Gly Pro Ala Asp Arg Arg Phe Val
245 250 255

Ser Gly Ser Leu Glu Ala Pro Ala Glu Thr Phe Ala Arg His Leu Asp
260 265 270

Arg Gly Pro Ser Gly Thr Thr Gly Ser Met Gln Cys Met Ala Leu Arg
275 280 285

Ala Ala Val Ser Asp Val Leu Gly His Leu Thr Arg Leu Ala His Leu
290 295 300

Trp Glu Thr Gly Lys Arg Ser Gly Gly Thr Tyr Gly Ile Val Asp Ala
305 310 315 320

Ile Val Ser Thr Val Glu Val Leu Ser Ile Val His His His Ala Gln
325 330 335

Tyr Ile Ile Asn Ala Thr Leu Thr Gly Tyr Val Val Trp Ala Ser Asp
340 345 350

Ser Leu Asn Asn Glu Tyr Leu Arg Ala Ala Val Asp Ser Gln Glu Arg
355 360 365

Phe Cys Arg Thr Ala Ala Pro Leu Phe Pro Thr Met Thr Ala Pro Ser
370 375 380

Trp Ala Arg Met Glu Leu Ser Ile Lys Ser Trp Phe Gly Ala Ala Pro
385 390 395 400

Asp Leu Leu Arg Ser Gly Thr Pro Ser Pro His Tyr Glu Ser Ile Leu
405 410 415

Arg Leu Ala Ala Ser Gly Pro Pro Gly Gly Arg Gly Ala Val Gly Gly
420 425 430

Ser Cys Arg Asp Lys Ile Gln Arg Thr Arg Arg Asp Asn Ala Pro Pro
435 440 445

Pro Leu Pro Arg Ala Arg Pro His Ser Thr Pro Ala Ala Pro Arg Arg
450 455 460

Phe Arg Arg His Arg Glu Asp Leu Pro Glu Pro Pro His Val Asp Ala
465 470 475 480

Ala Asp Arg Gly Pro Glu Pro Cys Ala Gly Arg Pro Ala Thr Tyr Tyr
485 490 495

Thr His Met Ala Gly Ala Pro Pro Arg Leu Pro Pro Arg Asn Pro Ala
500 505 510

Pro Pro Glu Gln Arg Pro Ala Ala Ala Ala Arg Pro Leu Ala Ala Gln
515 520 525

Arg Glu Ala Ala Gly Val Tyr Asp Ala Val Arg Thr Trp Gly Pro Asp
530 535 540

Ala Glu Ala Glu Pro Asp Gln Met Glu Asn Thr Tyr Leu Leu Pro Asp
545 550 555 560

Asp Asp Ala Ala Met Pro Ala Gly Val Gly Leu Gly Ala Thr Pro Ala
565 570 575

Ala Asp Thr Thr Ala Ala Ala Trp Pro Ala Glu Ser His Ala Pro Arg
580 585 590

Ala Pro Ser Glu Asp Ala Asp Ser Ile Tyr Glu Ser Val Ser Glu Asp
595 600 605

Gly Gly Arg Val Tyr Glu Glu Ile Pro Trp Val Arg Val Tyr Glu Asn
610 615 620

Ile Cys Leu Arg Arg Gln Asp Ala Gly Gly Ala Ala Pro Pro Gly Asp
625 630 635 640

Ala Pro Asp Ser Pro Tyr Ile Glu Ala Glu Asn Pro Leu Tyr Asp Trp
645 650 655

Gly Gly Ser Ala Leu Phe Ser Pro Pro Gly Ala Thr Arg Ala Pro Asp
660 665 670

Pro Gly Leu Ser Leu Ser Pro Met Pro Ala Arg Pro Arg Thr Asn Ala
675 680 685

Asn Asp Gly Pro Thr Asn Val Ala Ala Leu Ser Ala Leu Leu Thr Lys
690 695 700

Leu Lys Arg Gly Arg His Gln Ser His
705 710

(2) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:



Val Gly Ala Ala Ala Val Pro Leu Leu Ser Ala Gly Gly Ala Ala Pro
1 5 10 15

Pro His Pro Gly Pro Asp Ala Ala Val Phe Arg Ser Ser Leu Gly Ser
20 25 30

Leu Leu Tyr Trp Pro Gly Val Arg Ala Leu Leu Gly Arg Asp Cys Arg
35 40 45

Val Ala Ala Arg Tyr Ala Gly Arg Met Thr Tyr Ile Ala Thr Gly Ala
50 55 60

Leu Leu Ala Arg Phe Asn Pro Gly Ala Val Lys Cys Val Leu Pro Arg
65 70 75 80

Glu Ala Ala Phe Ala Gly Arg Val Leu Asp Val Leu Ala Val Leu Ala
85 90 95

Glu Gln Thr Val Gln Trp Leu Ser Val Val Val Gly Ala Arg Leu His
100 105 110

Pro His Ser Ala His Pro Ala Phe Val Asp Val Glu Gln Glu Ala Leu
115 120 125

Phe Arg Ala Leu Pro Leu Gly Ser Pro Gly Val Val Ala Ala Glu His
130 135 140

Glu Ala Leu Gly Asp Thr Ala Ala Arg Arg Leu Leu Ala Thr Ser Gln
145 150 155 160

Ala Val Leu Gly Ala Ala Val Tyr Ala Leu His Thr Ala Thr Val Thr
165 170 175

Leu Lys Tyr Ala Cys Gly Asp Ala Arg Arg Arg Arg Asp His Ala Ala
180 185 190

Ala Ala Arg Ala Val Leu Ala Thr Gly Leu Ile Leu Gln Arg Leu Leu
195 200 205

Gly Leu Ala Asp Thr Val Val Ala Cys Val Ala Ala Phe Asp Gly Gly
210 215 220

Ser Thr Ala Pro Glu Val Gly Thr Tyr Thr Pro Leu Arg Tyr Ala Cys
225 230 235 240

Val Leu Arg Ala Thr Gln Pro Leu Tyr Ala Arg Thr Thr Pro Ala Lys
245 250 255

Phe Trp Ala Asp Val Arg Ala Ala Ala Glu His Val Asp Leu Arg Pro
260 265 270

Ala Ser Ser Ala Pro Arg Ala Pro Val Ser Gly Thr Ala Asp Pro Ala
275 280 285

Phe Leu Leu Glu Asp Leu Ala Ala Phe Pro Pro Ala Pro Leu Asn Ser
290 295 300

Glu Ser Val Leu Gly Pro Arg Val Arg Val Val Asp Ile Met Ala Gln
305 310 315 320

Phe Arg Lys Leu Leu Met Gly Asp Glu Glu Thr Ala Ala Leu Arg Ala
325 330 335

His Val Ser Gly Arg Arg Ala Thr Gly Leu Gly Gly Pro Pro Arg Pro
340 345 350

(2) INFORMATION FOR SEQ ID NO: 261:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 457 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:

Met Ser Val Arg Gly His Ala Val Arg Arg Arg Arg Ala Ser Thr Arg
1 5 10 15

Ser His Ala Pro Ser Ala His Arg Ala Asp Ser Pro Val Glu Asp Glu
20 25 30

Pro Glu Gly Gly Gly Val Gly Leu Met Gly Tyr Leu Arg Ala Val Phe
35 40 45

Asn Val Asp Asp Asp Ser Glu Val Glu Ala Ala Gly Glu Met Ala Ser
50 55 60

Glu Glu Pro Pro Pro Arg Arg Arg Arg Glu Arg His Pro Gly Ser Arg
65 70 75 80

Arg Ala Ser Glu Ala Arg Ala Ala Ala Pro Pro Arg Arg Ala Ser Phe
85 90 95

Pro Arg Pro Arg Ser Val Thr Ala Arg Ser Gln Ser Val Arg Gly Arg
100 105 110

Arg Asp Ser Ala Ile Thr Arg Ala Pro Arg Gly Gly Tyr Leu Gly Pro
115 120 125

Met Asp Pro Arg Asp Val Leu Gly Arg Val Gly Gly Ser Arg Val Val
130 135 140

Pro Ser Pro Leu Phe Leu Asp Glu Leu Ser Tyr Glu Glu Asp Asp Tyr
145 150 155 160

Pro Ala Ala Val Ala His Asp Asp Gly Ala Gly Ala Arg Pro Pro Ala
165 170 175

Thr Val Glu Ile Leu Ala Gly Arg Val Ser Gly Pro Glu Leu Gln Ala
180 185 190

Ala Phe Pro Leu Asp Arg Leu Thr Pro Arg Val Ala Ala Trp Asp Glu
195 200 205

Ser Val Arg Ser Ala Leu Gly His Pro Ala Gly Phe Tyr Pro Cys Pro
210 215 220

Asp Ser Ala Phe Gly Leu Ser Arg Val Gly Val Met His Phe Asp Ala
225 230 235 240

Asp Pro Lys Val Phe Phe Arg Gln Thr Leu Gln Gln Gly Glu Ala Trp
245 250 255

Tyr Val Thr Gly Asp Ala Ile Leu Asp Leu Thr Asp Arg Arg Ala Lys
260 265 270

Thr Ser Pro Ser Arg Ala Met Gly Phe Leu Val Asp Ala Ile Val Arg
275 280 285

Val Ala Ile Asn Gly Trp Val Cys Gly Thr Arg Leu His Thr Glu Gly
290 295 300

Ala Arg Leu Gly Ala Arg Arg Gln Gly Gly Arg Ala Pro Thr Ala Val
305 310 315 320

Arg Glu Pro His Gly Val Ala Arg Gly Arg Arg Arg Ala Ala Ala Gln
325 330 335

Arg Gly Arg Gly Arg Ala Pro Pro Pro Arg Pro Arg Arg Arg Gly Leu
340 345 350

Ser Gln Phe Ala Gly Val Pro Ala Val Leu Arg Ala Arg Ala Pro Gly
355 360 365

Ala Arg Leu Ser Arg Gly Arg Pro Leu Arg Gly Ala His Asp Val His
370 375 380

Arg His Arg Gly Ser Ala Arg Pro Leu Gln Pro Arg Arg Arg Gln Met
385 390 395 400

Arg Ala Pro Ala Gly Gly Arg Val Cys Gly Ala Arg Pro Gly Arg Ala
405 410 415

Gly Gly Pro Gly Gly Ala Asp Gly Pro Val Gly Gly Arg Gly Gly Ala
420 425 430

Pro Ala Pro Ala Leu Arg Pro Pro Arg Val Cys Gly Arg Gly Ala Gly
435 440 445

Gly Ala Val Ser Arg Pro Ala Pro Gly
450 455

(2) INFORMATION FOR SEQ ID NO: 262:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 298 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:

Met Thr Ser Arg Arg Ser Val Lys Ser Cys Pro Arg Glu Ala Pro Arg
1 5 10 15

Gly Thr His Glu Glu Leu Tyr Tyr Gly Pro Val Ser Pro Ala Asp Pro
20 25 30

Glu Ser Pro Arg Asp Asp Phe Arg Arg Gly Ala Gly Pro Met Arg Ala
35 40 45

Arg Pro Arg Gly Glu Val Arg Phe Leu His Tyr Asp Glu Ala Gly Tyr
50 55 60

Ala Leu Tyr Arg Asp Ser Ser Ser Ser Glu Asp Asn Asp Glu Ser Arg
65 70 75 80

Asp Thr Ala Arg Pro Arg Arg Ser Ala Ser Val Ala Gly Ser His Gly
85 90 95

Pro Gly Pro Ala Arg Ala Pro Pro Pro Pro Gly Gly Pro Val Gly Ala
100 105 110

Gly Gly Arg Ser His Ala Pro Pro Ala Arg Thr Pro Lys Met Thr Arg
115 120 125

Gly Ala Pro Lys Ala Pro Ala Thr Pro Ala Thr Asp Pro Arg Arg Arg
130 135 140

Pro Ala Gln Ala Asp Ser Ala Val Leu Leu Asp Ala Pro Ala Pro Thr
145 150 155 160

Ala Ser Gly Arg Thr Lys Thr Pro Ala Gln Gly Leu Ala Lys Lys Leu
165 170 175

His Phe Ser Thr Ala Pro Pro Ser Pro Thr Ala Pro Trp Thr Pro Arg
180 185 190

Val Ala Gly Phe Asn Lys Arg Val Phe Cys Ala Ala Val Gly Arg Leu
195 200 205

Ala Ala Thr His Ala Arg Leu Ala Ala Val Gln Leu Trp Asp Met Ser
210 215 220

Arg Pro His Thr Asp Glu Asp Leu Asn Glu Leu Leu Asp Leu Thr Thr
225 230 235 240

Ile Arg Val Thr Val Cys Glu Gly Lys Asn Leu Leu Gln Arg Ala Asn
245 250 255

Glu Leu Val Asn Pro Asp Ala Ala Gln Asp Val Asp Ala Thr Ala Ala
260 265 270

Arg Arg Pro Ala Gly Arg Ala Ala Ala Thr Ala Arg Ala Pro Ala Arg
275 280 285

Ser Ala Ser Arg Pro Arg Arg Pro Leu Glu
290 295

(2) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 70 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:

Val Val Leu Leu Phe Val Val Ala Gly Val Pro Gly Glu Pro Pro Asn
1 5 10 15

Ala Ala Gly Arg Val Ile Gly Asp Ala Gln Cys Arg Gly Asp Ser Ala
20 25 30

Gly Val Val Ser Val Pro Gly Val Leu Val Pro Phe Tyr Leu Gly Met
35 40 45

Thr Ser Met Gly Val Cys Met Ile Ala His Val Tyr Gln Ile Cys Gln
50 55 60

Arg Ala Ala Gly Ser Ala
65 70






(2) INFORMATION FOR SEQ ID NO: 264:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 363 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:



Met Ser Gln Trp Gly Pro Arg Ala Ile Leu Val Gln Thr Asp Ser Thr
1 5 10 15

Asn Arg Asn Ala Asp Gly Asp Trp Gln Ala Ala Val Ala Ile Arg Gly
20 25 30

Gly Gly Val Val Gln Leu Asn Met Val Asn Lys Arg Ala Val Asp Phe
35 40 45

Thr Pro Ala Glu Cys Gly Asp Ser Glu Trp Ala Val Gly Arg Val Ser
50 55 60

Leu Gly Leu Arg Met Ala Met Pro Arg Asp Phe Cys Ala Ile Ile His
65 70 75 80

Ala Pro Ala Val Ser Gly Pro Gly Pro His Val Met Leu Gly Leu Val
85 90 95

Asp Ser Gly Tyr Arg Gly Thr Val Leu Ala Val Val Val Ala Pro Asn
100 105 110

Gly Thr Arg Gly Phe Ala Pro Gly Ala Leu Arg Val Asp Val Thr Phe
115 120 125

Leu Asp Ile Arg Ala Thr Pro Pro Thr Leu Thr Glu Pro Ser Ser Leu
130 135 140

His Arg Phe Pro Gln Leu Ala Pro Ser Pro Leu Ala Gly Leu Arg Glu
145 150 155 160

Asp Pro Trp Leu Asp Gly Ala Thr Ala Gly Gly Ala Val Pro Ala Arg
165 170 175

Arg Arg Gly Gly Ser Leu Val Tyr Ala Gly Glu Leu Thr Gln Val Thr
180 185 190

Thr Glu His Gly Asp Cys Val His Glu Ala Pro Ala Phe Leu Pro Lys
195 200 205

Arg Glu Glu Asp Ala Gly Phe Asp Ile Leu Ile His Arg Ala Val Thr
210 215 220

Val Pro Ala Asn Gly Ala Thr Val Ile Gln Pro Ser Leu Arg Val Leu
225 230 235 240

Arg Ala Ala Asp Gly Pro Glu Ala Cys Tyr Val Leu Gly Arg Ser Ser
245 250 255

Leu Asn Arg Leu Leu Val Met Pro Thr Arg Trp Pro Ser Gly His Ala
260 265 270

Cys Ala Phe Val Val Cys Asn Leu Thr Gly Val Pro Val Thr Leu Gln
275 280 285

Ala Gly Ser Lys Val Ala Gln Leu Leu Val Ala Gly Thr His Ala Leu
290 295 300

Pro Trp Ile Pro Pro Asp Asn Ile His Glu Asp Gly Ala Phe Arg Ala
305 310 315 320

Tyr Pro Arg Gly Val Pro Asp Ala Thr Ala Thr Pro Arg Asp Pro Pro
325 330 335

Ile Leu Val Phe Thr Asn Glu Phe Asp Ala Asp Ala Pro Pro Ser Lys
340 345 350

Arg Gly Ala Gly Gly Phe Gly Ser Thr Gly Ile
355 360






(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 236 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:

Met Ala Ser Leu Leu Gly Val Leu Cys Gly Trp Gly Trp Glu Glu Gln
1 5 10 15

Gln Tyr Glu Met Ile Arg Ala Ala Ala Pro Pro Ser Xaa Xaa Asp Pro
20 25 30

Arg Leu Gln Glu Ala Val Val Asn Ala Leu Leu Pro Ala Pro Ile Thr
35 40 45

Leu Asp Asp Ala Leu Glu Ser Leu Asp Asp Thr Arg Arg Leu Val Lys
50 55 60

Ala Arg Ala Arg Thr Tyr His Ala Cys Met Val Asn Leu Glu Arg Leu
65 70 75 80

Ala Arg His His Pro Gly Leu Glu Gly Ser Thr Ile Asp Gly Ala Val
85 90 95

Ala Ala His Arg Asp Lys Met Arg Arg Leu Ala Asp Thr Cys Met Ala
100 105 110

Thr Ile Leu Gln Met Tyr Met Ser Val Gly Ala Ala Asp Lys Ser Ala
115 120 125

Asp Val Leu Val Ser Gln Ala Ile Arg Ser Met Ala Glu Ser Asp Val
130 135 140

Val Met Glu Asp Val Ala Ile Ala Glu Arg Ala Leu Gly Leu Ser Thr
145 150 155 160

Ser Ala Gly Gly Thr Arg Thr Ala Gly Leu Gly Ala Thr Glu Ala Pro
165 170 175

Pro Gly Pro Thr Arg Ala Gln Ala Pro Glu Val Ala Ser Val Pro Val
180 185 190

Thr His Ala Gly Asp Arg Ser Pro Val Arg Pro Gly Pro Val Pro Pro
195 200 205

Ala Asp Pro Thr Pro Asp Pro Arg His Arg Thr Ser Ala Pro Lys Arg
210 215 220

Gln Ala Ser Ser Thr Glu Ala Pro Leu Leu Leu Ala
225 230 235






(2) INFORMATION FOR SEQ ID NO: 266:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 453 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266:

Met Tyr Val Asn Arg Asn Glu Ile Phe Asn Ala Ala Val Thr Asn Ile
1 5 10 15

Ile Leu Asp Leu Asp Ile Ala Leu Lys Glu Pro Val Pro Phe Pro Arg
20 25 30

Leu His Glu Ala Leu Gly His Phe Arg Arg Gly Ala Ala Val Gln Leu
35 40 45

Leu Phe Pro Ala Ala Arg Val Asp Pro Asp Ala Tyr Pro Cys Tyr Phe
50 55 60

Phe Lys Ser Ala Cys Arg Pro Arg Ala Pro Pro Val Cys Ala Gly Asp
65 70 75 80

Gly Pro Ser Ala Gly Gly Asp Asp Gly Asp Gly Asp Trp Phe Pro Asp
85 90 95

Ala Gly Gly Asp Asp Gly Asp Glu Glu Trp Glu Glu Asp Thr Asp Pro
100 105 110

Met Asp Thr Thr His Gly Pro Leu Pro Asp Asp Glu Ala Ala Tyr Leu
115 120 125

Asp Leu Leu His Glu Gln Ile Pro Ala Ala Thr Pro Ser Glu Pro Asp
130 135 140

Ser Val Val Cys Ser Cys Ala Asp Lys Ile Gly Leu Arg Val Cys Leu
145 150 155 160

Pro Val Pro Ala Pro Tyr Val Val His Gly Ser Leu Thr Met Arg Gly
165 170 175

Val Ala Arg Val Ile Gln Gln Ala Val Leu Leu Asp Arg Asp Phe Val
180 185 190

Glu Ala Val Gly Ser His Val Lys Asn Phe Leu Leu Ile Asp Thr Gly
195 200 205

Val Tyr Ala His Gly His Ser Leu Arg Leu Pro Tyr Phe Ala Lys Ile
210 215 220

Gly Pro Asp Gly Ser Ala Cys Gly Arg Leu Leu Pro Val Phe Val Ile
225 230 235 240

Pro Pro Ala Cys Glu Asp Val Pro Ala Phe Val Ala Ala His Ala Asp
245 250 255

Pro Arg Arg Phe His Phe His Ala Pro Pro Met Phe Ser Ala Ala Pro
260 265 270

Arg Glu Ile Arg Val Leu His Ser Leu Gly Gly Asp Tyr Val Ser Phe
275 280 285

Phe Glu Lys Lys Ala Ser Arg Asn Ala Leu Glu His Phe Gly Arg Arg
290 295 300

Glu Thr Leu Thr Glu Val Leu Gly Arg Tyr Asp Val Arg Pro Asp Ala
305 310 315 320

Gly Glu Thr Val Glu Gly Phe Ala Ser Glu Leu Leu Gly Arg Ile Val
325 330 335

Ala Cys Ile Glu Ala His Phe Pro Glu His Ala Arg Glu Tyr Gln Ala
340 345 350

Val Ser Val Arg Arg Ala Val Ile Lys Asp Asp Trp Val Leu Leu Gln
355 360 365

Leu Ile Pro Gly Arg Gly Ala Leu Asn Gln Ser Leu Ser Cys Leu Arg
370 375 380

Phe Lys His Gly Arg Ala Ser Arg Ala Thr Ala Arg Thr Phe Leu Ala
385 390 395 400

Leu Ser Val Gly Thr Asn Asn Arg Leu Cys Ala Ser Leu Cys Gln Gln
405 410 415

Cys Phe Ala Thr Lys Cys Asp Asn Asn Arg Leu His Thr Leu Phe Thr
420 425 430

Val Asp Ala Gly Thr Pro Cys Ser Arg Ser Ala Pro Ser Ser Thr Ser
435 440 445

Arg Pro Ser Ser Ser
450

(2) INFORMATION FOR SEQ ID NO: 267:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 332 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:

Met Leu Ala Val Arg Ser Leu Gln His Leu Thr Thr Val Ile Phe Ile
1 5 10 15

Thr Ala Tyr Gly Leu Val Leu Ala Trp Tyr Ile Val Phe Gly Asp Leu
20 25 30

His Arg Cys Ile Tyr Ala Val Arg Pro Ala Gly Ala His Asn Asp Thr
35 40 45

Ala Leu Val Trp Met Lys Ile Asn Gln Thr Leu Leu Phe Leu Gly Pro
50 55 60

Pro Thr Ala Pro Pro Gly Gly Ala Trp Thr Pro His Ala His Val Cys
65 70 75 80

Tyr Ala Asn Ile Ile Glu Gly Arg Ala Val Ser Leu Pro Ala Ile Pro
85 90 95

Gly Ala Met Ser Arg Arg Val Met Asn Val His Glu Ala Val Asn Cys
100 105 110

Leu Glu Ala Leu Trp Asp Thr Gln Met Arg Leu Val Val Val Gly Trp
115 120 125

Phe Leu Tyr Leu Ala Phe Val His Gln Arg Arg Cys Met Phe Gly Val
130 135 140

Val Ser Pro Ala His Ser Met Val Ala Pro Ala Thr Tyr Leu Leu Asn
145 150 155 160

Tyr Ala Gly Arg Ile Val Ser Ser Val Phe Leu Gln Tyr Pro Tyr Thr
165 170 175

Lys Ile Thr Arg Leu Leu Cys Glu Leu Ser Val Gln Arg Gln Thr Leu
180 185 190

Val Gln Leu Phe Glu Ala Asp Pro Val Thr Phe Leu Tyr His Arg Pro
195 200 205

Ala Val Gly Val Ile Val Gly Cys Glu Leu Leu Leu Arg Phe Val Gly
210 215 220

Leu Ile Val Gly Thr Ala Leu Ile Ser Arg Gly Ala Cys Ala Ile Thr
225 230 235 240

Tyr Pro Leu Phe Leu Thr Ile Thr Thr Trp Cys Phe Val Ser Ile Ile
245 250 255

Ala Leu Thr Glu Leu Tyr Phe Ile Leu Arg Arg Asp Ser Ala Pro Lys
260 265 270

Asn Ala Glu Pro Ala Ala Pro Arg Gly Arg Ser Lys Gly Trp Ser Gly
275 280 285

Val Cys Gly Arg Cys Cys Ser Ile Ile Leu Ser Gly Ile Ala Val Arg
290 295 300

Leu Cys Tyr Ile Ala Val Val Ala Gly Val Val Leu Met Ala Leu Arg
305 310 315 320

Tyr Glu Gln Glu Ile Gln Arg Arg Leu Phe Asp Leu
325 330

(2) INFORMATION FOR SEQ ID NO: 268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 117 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268:



Met Ile Gly Ala His Pro Gly Val Gly Gly Asp Leu Pro Ser Gly Leu
1 5 10 15

Pro Thr Tyr Ala Glu Ala Thr Ser Asp Arg Pro Pro Thr Tyr Ala Met
20 25 30

Val Met Ala Ala Cys Pro Thr Glu Pro Pro Gly Gly Ser Val Gly Pro
35 40 45

Ala Asp Gln Pro Arg Val Gln Ser Ser Arg Thr Trp Arg Pro Pro Leu
50 55 60

Val Asn Ser Arg Glu Leu Tyr Arg Ala Gln Arg Ala Ala Arg Cys Ala
65 70 75 80

Ser Ser Ser Asp Thr Pro Gln Ala Pro Gly Trp Cys Gly Gly Thr Cys
85 90 95

Arg His Ala Val Phe Gly Val Val Ala Val Val Val Val Ile Ile Leu
100 105 110

Ala Phe Leu Trp Arg
115



(2) INFORMATION FOR SEQ ID NO: 269:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 194 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:

Met His Leu Phe Cys Gln Cys Pro Leu Thr Asp Gly Gln Asp Leu Tyr
1 5 10 15

Leu Cys Pro Val Tyr Pro Arg Met His Gln Glu His Leu Val Cys Pro
20 25 30

Leu His Arg Leu Asp Asp Ala Arg Arg Arg Gly Arg Thr Ser Ala Ala
35 40 45

Trp Asp Glu Gly Leu Val Arg Ala Leu Thr His Ser Gly Gly Leu Met
50 55 60

Gly Cys Gly Gly Arg Ser Leu Thr Leu Ser Glu Thr Tyr Trp Gly His
65 70 75 80

Pro Leu Tyr Glu Lys Leu Val Pro Trp Asp His Pro Arg Asp Leu Lys
85 90 95

Val Pro Glu Ala Ser Ala Val Gly Thr Arg Ala Leu Val Pro Arg Gly
100 105 110

Arg Gly Arg Pro Leu Arg Gly Arg Pro Val Pro Leu Ile Pro Leu Asp
115 120 125

Cys Glu Pro Asn Asp Gly Leu Pro Phe Gly Gly Gly Trp Pro Gly Gly
130 135 140

Arg Leu Arg Gly Ala Pro Val Pro Leu His Pro Pro Pro Pro Ser Ala
145 150 155 160

Pro Pro Leu Ser Phe Thr Pro Thr Leu Thr Pro Pro Cys Leu Cys Arg
165 170 175

Gly Leu Ser Leu Cys Val Val Val Lys Gln Tyr Leu Lys Asp Arg Asn
180 185 190

Asn Phe






(2) INFORMATION FOR SEQ ID NO: 270:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 853 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:



Met Asn Val Ala Thr Cys Thr His Gln Thr His His Ala Ala Arg Ala
1 5 10 15

Pro Gly Ala Thr Ser Ala Pro Gly Ala Ala Ser Gly Asp Pro Leu Gly
20 25 30

Ala Arg Arg Pro Ile Gly Asp Asp Glu Cys Glu Gln Tyr Thr Ser Ser
35 40 45

Val Ser Leu Ala Arg Met Leu Tyr Gly Gly Asp Leu Ala Glu Trp Val
50 55 60

Pro Arg Val His Pro Lys Thr Thr Ile Glu Arg Gln Gln His Gly Pro
65 70 75 80

Val Thr Phe Pro Asp Ala Ser Ala Pro Thr Ala Arg Cys Val Thr Val
85 90 95

Val Arg Ala Pro Met Gly Ser Gly Lys Thr Thr Ala Leu Ile Arg Trp
100 105 110

Leu Gly Glu Ala Ile His Ser Pro Asp Thr Ser Val Leu Val Val Ser
115 120 125

Cys Arg Arg Ser Phe Thr Gln Thr Leu Ala Thr Arg Phe Ala Glu Ser
130 135 140

Gly Leu Pro Asp Phe Val Thr Tyr Phe Ser Ser Thr Asn Tyr Ile Met
145 150 155 160

Asn Asp Arg Pro Phe His Arg Leu Ile Val Gln Val Glu Ser Leu His
165 170 175

Arg Val Gly Pro Asn Leu Leu Asn Asn Tyr Asp Val Leu Val Leu Asp
180 185 190

Glu Val Met Ser Thr Leu Gly Gln Lys Pro Thr Met Gln Gln Leu Gly
195 200 205

Arg Val Asp Ala Leu Met Leu Arg Leu Leu Arg Thr Cys Pro Arg Ile
210 215 220

Ile Ala Met Asp Ala Thr Ala Asn Ala Gln Leu Val Asp Phe Leu Cys
225 230 235 240

Ser Leu Arg Gly Glu Lys Asn Val His Val Val Ile Gly Glu Tyr Ala
245 250 255

Met Pro Gly Phe Ser Ala Arg Arg Cys Leu Phe Leu Pro Arg Leu Gly
260 265 270

Pro Glu Val Leu Gln Ala Ala Leu Arg Pro Pro Gly Pro Ala Gly Gly
275 280 285

Ala Pro Pro Pro Asp Ala Pro Pro Asp Ala Thr Phe Phe Gly Glu Leu
290 295 300

Glu Ala Arg Leu Ala Gly Gly Asp Asn Val Cys Ile Phe Ser Ser Thr
305 310 315 320

Val Ser Phe Ala Glu Val Val Ala Arg Phe Cys Arg Gln Phe Thr Asp
325 330 335

Arg Val Leu Leu Leu His Ser Leu Thr Pro Pro Gly Asp Val Thr Thr
340 345 350

Trp Gly Arg Tyr Arg Val Val Ile Tyr Thr Thr Val Val Thr Val Gly
355 360 365

Leu Ser Phe Asp Pro Pro His Phe Asp Ser Met Phe Ala Tyr Val Lys
370 375 380

Pro Met Asn Tyr Gly Pro Asp Met Val Ser Val Tyr Gln Ser Leu Gly
385 390 395 400

Arg Val Arg Thr Leu Arg Lys Gly Glu Leu Leu Ile Tyr Met Asp Gly
405 410 415

Ser Gly Ala Arg Ser Glu Pro Val Phe Thr Pro Met Leu Leu Asn His
420 425 430

Val Val Ser Ala Ser Gly Gln Trp Pro Ala Gln Phe Ser Gln Val Thr
435 440 445

Asn Leu Leu Cys Arg Arg Phe Lys Gly Arg Cys Asp Ala Ser His Ala
450 455 460

Asp Ala Ala Gln Arg Ser Arg Ile Tyr Ser Lys Phe Arg Tyr Lys His
465 470 475 480

Tyr Phe Glu Arg Cys Thr Leu Ala Cys Leu Ala Asp Ser Leu Asn Ile
485 490 495

Leu His Met Leu Leu Thr Leu Asn Cys Met His Val Arg Phe Trp Gly
500 505 510

His Asp Ala Ala Leu Thr Pro Arg Asn Phe Cys Leu Phe Leu Arg Gly
515 520 525

Ile His Phe Asp Ala Leu Arg Ala Gln Arg Asp Leu Arg Glu Leu Arg
530 535 540

Cys Gln Asp Pro Asp Thr Ser Leu Ser Ala Gln Ala Ala Glu Thr Glu
545 550 555 560

Glu Val Gly Leu Phe Val Glu Lys Tyr Leu Arg Pro Asp Val Ala Pro 

565 570 575 

Ala Glu Val Val Met Arg Gin Ser Leu Val Gly Arg Thr Arg Phe He 
35 580 585 590 

Tyr Leu Val Leu Leu Glu Ala Cys Leu Arg Val Pro Met Ala Ala His 

595 600 605 

Ser Ser Ala He Phe Arg Arg Leu Tyr Asp His Tyr Ala Thr Gly Val 
610 615 620 

40 He Pro Thr He Asn Ala Ala Gly Glu Leu Glu Leu Val His Pro Thr 
625 630 635 640 

Leu Asn Val Ala Pro Val Trp Glu Leu Phe Arg Leu Cys Ser Thr Met 
645 650 655 
Ala Ala Cys Leu Gln Trp Asp Ser Met Ala Gly Gly Ser Gly Arg Thr

660 665 670 

Phe Ser Pro Glu Asp Val Leu Glu Leu Leu Asn Pro His Tyr Asp Arg 
675 680 685 

Tyr Met Gln Leu Val Phe Glu Leu Gly His Cys Asn Val Thr Asp Gly
690 695 700 

Pro Leu Leu Ser Glu Asp Ala Val Lys Arg Val Ala Asp Ala Leu Ser 
705 710 715 720 

Gly Cys Pro Pro Arg Gly Ser Val Ser Glu Thr Glu His Ala Leu Ser 
725 730 735

Leu Phe Lys Ile Ile Trp Gly Glu Leu Phe Gly Val Gln Leu Ala Lys

740 745 750

Ser Thr Gln Thr Phe Pro Gly Ala Gly Arg Val Lys Asn Leu Thr Lys
755 760 765 

Arg Ala Ile Val Glu Leu Leu Asp Ala His Arg Ile Asp His Ser Ala
770 775 780 

Cys Arg Thr Gln Leu Tyr Ala Leu Leu Met Ala His Lys Arg Glu Phe
785 790 795 800 

Ala Gly Ala Arg Phe Lys Leu Arg Ala Pro Ala Trp Gly Arg Cys Leu 
805 810 815

Arg Thr His Ala Ser Gly Ala Gln Pro Asn Thr Asp Ile Ile Ala Ala

820 825 830 

Leu Ser Glu Leu Pro Thr Glu Ala Trp Pro Met Met Gln Gly Ala Val
835 840 845 

Asn Phe Ser Thr Leu
850 

(2) INFORMATION FOR SEQ ID NO: 271:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 857 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:

Met Ala Glu Thr Met Asn Val Ala Thr Cys Thr His Gln Thr His His
1 5 10 15

Ala Ala Arg Ala Pro Gly Ala Thr Ser Ala Pro Gly Ala Ala Ser Gly 
20 25 30 
Asp Pro Leu Gly Ala Arg Arg Pro Ile Gly Asp Asp Glu Cys Glu Gln

35 40 45 

Tyr Thr Ser Ser Val Ser Leu Ala Arg Met Leu Tyr Gly Gly Asp Leu 
50 55 60 

Ala Glu Trp Val Pro Arg Val His Pro Lys Thr Thr Ile Glu Arg Gln
65 70 75 80 

Gln His Gly Pro Val Thr Phe Pro Asp Ala Ser Ala Pro Thr Ala Arg

85 90 95 

Cys Val Thr Val Val Arg Ala Pro Met Gly Ser Gly Lys Thr Thr Ala 
100 105 110

Leu Ile Arg Trp Leu Gly Glu Ala Ile His Ser Pro Asp Thr Ser Val

115 120 125 

Leu Val Val Ser Cys Arg Arg Ser Phe Thr Gln Thr Leu Ala Thr Arg
130 135 140 

Phe Ala Glu Ser Gly Leu Pro Asp Phe Val Thr Tyr Phe Ser Ser Thr
145 150 155 160 

Asn Tyr Ile Met Asn Asp Arg Pro Phe His Arg Leu Ile Val Gln Val

165 170 175 

Glu Ser Leu His Arg Val Gly Pro Asn Leu Leu Asn Asn Tyr Asp Val 
180 185 190

Leu Val Leu Asp Glu Val Met Ser Thr Leu Gly Gln Lys Pro Thr Met

195 200 205 

Gln Gln Leu Gly Arg Val Asp Ala Leu Met Leu Arg Leu Leu Arg Thr
210 215 220 

Cys Pro Arg Ile Ile Ala Met Asp Ala Thr Ala Asn Ala Gln Leu Val
225 230 235 240 

Asp Phe Leu Cys Ser Leu Arg Gly Glu Lys Asn Val His Val Val Ile

245 250 255 

Gly Glu Tyr Ala Met Pro Gly Phe Ser Ala Arg Arg Cys Leu Phe Leu 
260 265 270

Pro Arg Leu Gly Pro Glu Val Leu Gln Ala Ala Leu Arg Pro Pro Gly

275 280 285 

Pro Ala Gly Gly Ala Pro Pro Pro Asp Ala Pro Pro Asp Ala Thr Phe 
290 295 300 

Phe Gly Glu Leu Glu Ala Arg Leu Ala Gly Gly Asp Asn Val Cys Ile
305 310 315 320

Phe Ser Ser Thr Val Ser Phe Ala Glu Val Val Ala Arg Phe Cys Arg 

325 330 335 

Gln Phe Thr Asp Arg Val Leu Leu Leu His Ser Leu Thr Pro Pro Gly
340 345 350

Asp Val Thr Thr Trp Gly Arg Tyr Arg Val Val Ile Tyr Thr Thr Val

355 360 365 

Val Thr Val Gly Leu Ser Phe Asp Pro Pro His Phe Asp Ser Met Phe 
370 375 380

Ala Tyr Val Lys Pro Met Asn Tyr Gly Pro Asp Met Val Ser Val Tyr 
385 390 395 400

Gln Ser Leu Gly Arg Val Arg Thr Leu Arg Lys Gly Glu Leu Leu Ile
405 410 415

Tyr Met Asp Gly Ser Gly Ala Arg Ser Glu Pro Val Phe Thr Pro Met 

420 425 430 

Leu Leu Asn His Val Val Ser Ala Ser Gly Gln Trp Pro Ala Gln Phe
435 440 445 

Ser Gln Val Thr Asn Leu Leu Cys Arg Arg Phe Lys Gly Arg Cys Asp
450 455 460 

Ala Ser His Ala Asp Ala Ala Gln Arg Ser Arg Ile Tyr Ser Lys Phe
465 470 475 480 

Arg Tyr Lys His Tyr Phe Glu Arg Cys Thr Leu Ala Cys Leu Ala Asp
485 490 495

Ser Leu Asn Ile Leu His Met Leu Leu Thr Leu Asn Cys Met His Val

500 505 510 

Arg Phe Trp Gly His Asp Ala Ala Leu Thr Pro Arg Asn Phe Cys Leu 
515 520 525 

Phe Leu Arg Gly Ile His Phe Asp Ala Leu Arg Ala Gln Arg Asp Leu
530 535 540 

Arg Glu Leu Arg Cys Gln Asp Pro Asp Thr Ser Leu Ser Ala Gln Ala
545 550 555 560 

Ala Glu Thr Glu Glu Val Gly Leu Phe Val Glu Lys Tyr Leu Arg Pro 
565 570 575

Asp Val Ala Pro Ala Glu Val Val Met Arg Gln Ser Leu Val Gly Arg

580 585 590 

Thr Arg Phe Ile Tyr Leu Val Leu Leu Glu Ala Cys Leu Arg Val Pro
595 600 605 

Met Ala Ala His Ser Ser Ala Ile Phe Arg Arg Leu Tyr Asp His Tyr
610 615 620 

Ala Thr Gly Val Ile Pro Thr Ile Asn Ala Ala Gly Glu Leu Glu Leu
625 630 635 640 

Val His Pro Thr Leu Asn Val Ala Pro Val Trp Glu Leu Phe Arg Leu 
645 650 655

Cys Ser Thr Met Ala Ala Cys Leu Gln Trp Asp Ser Met Ala Gly Gly

660 665 670 

Ser Gly Arg Thr Phe Ser Pro Glu Asp Val Leu Glu Leu Leu Asn Pro 
675 680 685 

His Tyr Asp Arg Tyr Met Gln Leu Val Phe Glu Leu Gly His Cys Asn
690 695 700 

Val Thr Asp Gly Pro Leu Leu Ser Glu Asp Ala Val Lys Arg Val Ala 
705 710 715 720 
Asp Ala Leu Ser Gly Cys Pro Pro Arg Gly Ser Val Ser Glu Thr Glu

725 730 735 

His Ala Leu Ser Leu Phe Lys Ile Ile Trp Gly Glu Leu Phe Gly Val
740 745 750 

Gln Leu Ala Lys Ser Thr Gln Thr Phe Pro Gly Ala Gly Arg Val Lys
755 760 765 

Asn Leu Thr Lys Arg Ala Ile Val Glu Leu Leu Asp Ala His Arg Ile

770 775 780 

Asp His Ser Ala Cys Arg Thr Gln Leu Tyr Ala Leu Leu Met Ala His
785 790 795 800

Lys Arg Glu Phe Ala Gly Ala Arg Phe Lys Leu Arg Ala Pro Ala Trp 

805 810 815 

Gly Arg Cys Leu Arg Thr His Ala Ser Gly Ala Gln Pro Asn Thr Asp
820 825 830 

Ile Ile Ala Ala Leu Ser Glu Leu Pro Thr Glu Ala Trp Pro Met Met
835 840 845 

Gln Gly Ala Val Asn Phe Ser Thr Leu
850 855 

(2) INFORMATION FOR SEQ ID NO: 272:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1370 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272:

Met Glu Pro Ala Asn Pro Pro Arg Asn Pro Met Ala Ala Pro Ala Arg 

1 5 10 15

Asp Pro Pro Gly Tyr Arg Tyr Ala Ala Ala Met Val Pro Thr Gly Ser 
20 25 30

Ile Leu Ser Thr Ile Glu Val Ala Ser His Arg Arg Leu Phe Asp Phe

35 40 45 

Phe Ala Arg Val Arg Ser Asp Glu Asn Ser Leu Tyr Asp Val Glu Phe 
50 55 60 

Asp Ala Leu Leu Gly Ser Tyr Cys Asn Thr Leu Ser Leu Val Arg Phe
65 70 75 80 

Leu Glu Leu Gly Leu Ser Val Ala Cys Val Cys Thr Lys Phe Pro Glu 
85 90 95 
Leu Ala Tyr Met Asn Glu Gly Arg Val Gln Phe Glu Val His Gln Pro

100 105 110 

Leu Ile Ala Arg Asp Gly Pro His Pro Val Glu Gln Pro Val His Asn
115 120 125 

Tyr Met Thr Lys Val Ile Asp Arg Arg Ala Leu Asn Ala Ala Phe Ser
130 135 140 

Leu Ala Thr Glu Ala Ile Ala Leu Leu Thr Gly Glu Ala Leu Asp Gly
145 150 155 160 

Thr Gly Ile Ser Leu His Arg Gln Leu Arg Ala Ile Gln Gln Leu Ala
165 170 175

Arg Asn Val Gln Ala Val Leu Gly Ala Phe Glu Arg Gly Thr Ala Asp

180 185 190 

Gln Met Leu His Val Leu Leu Glu Lys Ala Pro Pro Leu Ala Leu Leu
195 200 205 

Leu Pro Met Gln Arg Tyr Leu Asp Asn Gly Arg Leu Ala Thr Arg Val
210 215 220 

Ala Arg Ala Thr Leu Val Ala Glu Leu Lys Arg Ser Phe Cys Asp Thr 
225 230 235 240 

Ser Phe Phe Leu Gly Lys Ala Gly His Arg Arg Glu Ala Ile Glu Ala
245 250 255

Trp Leu Val Asp Leu Thr Thr Ala Thr Gln Pro Ser Val Ala Val Pro

260 265 270 

Arg Leu Thr His Ala Asp Thr Arg Gly Arg Pro Val Asp Gly Val Leu 
275 280 285 

Val Thr Thr Ala Ala Ile Lys Gln Arg Leu Leu Gln Ser Phe Leu Lys
290 295 300 

Val Glu Asp Thr Glu Ala Asp Val Pro Val Thr Tyr Gly Glu Met Val 
305 310 315 320 

Leu Asn Gly Ala Asn Leu Val Thr Ala Leu Val Met Gly Lys Ala Val 
325 330 335

Arg Ser Leu Asp Asp Val Gly Arg His Leu Leu Glu Met Gln Glu Glu

340 345 350 

Gln Leu Glu Ala Asn Arg Glu Thr Leu Asp Glu Leu Glu Ser Ala Pro
355 360 365 

Gln Thr Thr Arg Val Arg Ala Asp Leu Val Ala Ile Gly Asp Arg Leu
370 375 380 

Val Phe Leu Glu Ala Leu Glu Lys Arg Ile Tyr Ala Ala Thr Asn Val
385 390 395 400 

Pro Tyr Pro Leu Val Gly Ala Met Asp Leu Thr Phe Val Leu Pro Leu 
405 410 415

Gly Leu Phe Asn Pro Ala Met Glu Arg Phe Ala Ala His Ala Gly Asp 

420 425 430 

Leu Val Pro Ala Pro Gly His Pro Glu Pro Arg Ala Phe Pro Pro Arg 
435 440 445

Gln Leu Phe Phe Trp Gly Lys Asp His Gln Val Leu Arg Leu Ser Met

450 455 460 

Glu Asn Ala Val Gly Thr Val Cys His Pro Ser Leu Met Asn Ile Asp
465 470 475 480

Ala Ala Val Gly Gly Val Asn His Asp Pro Val Glu Ala Ala Asn Pro 

485 490 495 

Tyr Gly Ala Tyr Val Ala Ala Pro Ala Gly Pro Gly Ala Asp Met Gln
500 505 510 

Gln Arg Phe Leu Asn Ala Trp Arg Gln Arg Leu Ala His Gly Arg Val
515 520 525 

Arg Trp Val Ala Glu Cys Gln Met Thr Ala Glu Gln Phe Met Gln Pro

530 535 540 

Asp Asn Ala Asn Leu Ala Leu Glu Leu His Pro Ala Phe Asp Phe Phe 
545 550 555 560

Ala Gly Val Ala Asp Val Glu Leu Pro Gly Gly Glu Val Pro Pro Ala 

565 570 575 

Gly Pro Gly Ala Ile Gln Ala Thr Trp Arg Val Val Asn Gly Asn Leu
580 585 590 

Pro Leu Ala Leu Cys Pro Val Ala Phe Arg Asp Arg Leu Glu Leu Gly
595 600 605 

Val Gly Arg His Ala Met Ala Pro Ala Thr Ile Ala Ala Val Arg Gly

610 615 620 

Ala Phe Glu Asp Arg Ser Tyr Pro Ala Val Phe Tyr Leu Leu Gln Ala
625 630 635 640

Ala Ile His Gly Ser Glu His Val Phe Cys Ala Arg Leu Val Thr Gln

645 650 655 

Cys Ile Thr Ser Tyr Trp Asn Asn Thr Arg Cys Ala Ala Phe Val Asn
660 665 670 

Asp Tyr Ser Leu Val Ser Tyr Ile Val Thr Tyr Leu Gly Gly Asp Leu
675 680 685 

Pro Glu Glu Cys Met Ala Val Tyr Arg Asp Leu Val Ala His Val Glu 

690 695 700 

Ala Gln Leu Val Asp Asp Phe Thr Leu Pro Gly Pro Glu Leu Gly Gly
705 710 715 720

Gln Ala Gln Ala Glu Leu Asn His Leu Met Arg Asp Pro Ala Leu Leu

725 730 735 

Pro Pro Leu Val Trp Asp Cys Asp Gly Leu Met Arg His Ala Ala Leu 
740 745 750 

Asp Arg His Arg Asp Cys Arg Ile Asp Ala Gly Gly His Glu Pro Val
755 760 765 

Tyr Ala Ala Ala Cys Asn Val Ala Thr Ala Asp Phe Asn Arg Asn Asp 
770 775 780 
Gly Arg Leu Leu His Asn Thr Gln Ala Arg Ala Ala Asp Ala Ala Asp
785 790 795 800 

Asp Arg Pro His Arg Pro Ala Asp Trp Thr Val His His Lys Ile Tyr
805 810 815 

Tyr Tyr Val Leu Val Pro Ala Phe Ser Arg Gly Arg Cys Cys Thr Ala
820 825 830 

Gly Val Arg Phe Asp Arg Val Tyr Ala Thr Leu Gln Asn Met Val Val

835 840 845 

Pro Glu Ile Ala Pro Gly Glu Glu Cys Pro Ser Asp Pro Val Thr Asp
850 855 860

Pro Ala His Pro Leu His Pro Ala Asn Leu Val Ala Asn Thr Val Asn 
865 870 875 880 

Ala Met Phe His Asn Gly Arg Val Val Val Asp Gly Pro Ala Met Leu 
885 890 895 

Thr Leu Gln Val Leu Ala His Asn Met Ala Glu Arg Thr Thr Ala Leu
900 905 910 

Leu Cys Ser Ala Ala Pro Asp Ala Gly Ala Asn Thr Ala Ser Thr Ala 

915 920 925 

Asn Met Arg Ile Phe Asp Gly Ala Leu His Ala Gly Val Leu Leu Met
930 935 940

Ala Pro Gln His Leu Asp His Thr Ile Gln Asn Gly Glu Tyr Phe Tyr
945 950 955 960 

Val Leu Pro Val His Ala Leu Phe Ala Gly Ala Asp His Val Ala Asn 
965 970 975 

Ala Pro Asn Phe Pro Pro Ala Leu Arg Asp Leu Ala Arg His Val Pro
980 985 990 

Leu Val Pro Pro Ala Leu Gly Ala Asn Tyr Phe Ser Ser Ile Arg Gln

995 1000 1005 

Pro Val Val Gln His Ala Arg Glu Ser Ala Ala Gly Glu Asn Ala Leu
1010 1015 1020

Thr Tyr Ala Leu Met Ala Gly Tyr Phe Lys Met Ser Pro Val Tyr His 
1025 1030 1035 1040

Gln Leu Lys Thr Gly Leu His Pro Gly Phe Gly Phe Thr Val Val Arg
1045 1050 1055 

Gln Asp Arg Phe Val Thr Glu Asn Val Leu Phe Ser Ala Ser Glu Ala
1060 1065 1070

Tyr Phe Leu Gly Gln Leu Gln Val Ala Arg His Glu Thr Gly Gly Gly

1075 1080 1085 

Val Ser Phe Thr Leu Thr Gln Pro Arg Gly Asn Val Asp Leu Gly Val
1090 1095 1100

Gly Tyr Thr Ala Val Ala Ala Thr Ala Thr Val Arg Asn Pro Val Thr 
1105 1110 1115 1120

Asp Met Gly Asn Leu Pro Gln Asn Phe Tyr Leu Gly Arg Gly Ala Pro
1125 1130 1135

Pro Leu Leu Asp Asn Ala Ala Ala Val Tyr Leu Arg Asn Ala Val Val 

1140 1145 1150

Ala Gly Asn Arg Leu Gly Pro Ala Gln Pro Leu Pro Val Phe Gly Cys
1155 1160 1165

Ala Gln Val Pro Arg Arg Ala Gly Met Asp His Gly Gln Asp Ala Val

1170 1175 1180 

Cys Glu Phe Ile Ala Thr Pro Val Ala Thr Asp Ile Asn Tyr Phe Arg
1185 1190 1195 1200

Arg Pro Cys Asn Pro Arg Gly Arg Ala Ala Gly Gly Val Tyr Ala Gly

1205 1210 1215 

Asp Lys Glu Gly Asp Val Ile Ala Leu Met Tyr Asp His Gly Gln Ser

1220 1225 1230 

Asp Pro Ala Arg Pro Phe Ala Ala Thr Ala Asn Pro Trp Ala Ser Gln
1235 1240 1245

Arg Phe Ser Tyr Gly Asp Leu Leu Tyr Asn Gly Ala Tyr His Leu Asn 

1250 1255 1260 

Gly Asp Val Leu Ser Pro Cys Phe Lys Phe Phe Thr Ala Ala Asp Ile
1265 1270 1275 1280

Thr Ala Lys His Arg Cys Leu Glu Arg Leu Ile Val Glu Thr Gly Ser

1285 1290 1295 

Ala Val Ser Thr Ala Thr Ala Ala Ser Asp Val Gln Phe Lys Arg Pro

1300 1305 1310 

Pro Gly Cys Arg Glu Leu Val Glu Asp Pro Cys Gly Leu Phe Gln Glu
1315 1320 1325

Ala Tyr Pro Ile Thr Cys Ala Ser Asp Pro Ala Leu Leu Arg Ser Ala

1330 1335 1340 

Arg Asp Gly Glu Ala His Ala Arg Glu Thr His Phe Thr Gln Tyr Leu
1345 1350 1355 1360

Ile Tyr Asp Asp Leu Lys Gly Leu Ser Leu

1365 1370 



(2) INFORMATION FOR SEQ ID NO: 273: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1360 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:273: 
Met Ala Ala Pro Ala Arg Asp Pro Pro Gly Tyr Arg Tyr Ala Ala Ala

1 5 10 15 

Met Val Pro Thr Gly Ser Ile Leu Ser Thr Ile Glu Val Ala Ser His
20 25 30

Arg Arg Leu Phe Asp Phe Phe Ala Arg Val Arg Ser Asp Glu Asn Ser 

35 40 45 

Leu Tyr Asp Val Glu Phe Asp Ala Leu Leu Gly Ser Tyr Cys Asn Thr 
50 55 60 

Leu Ser Leu Val Arg Phe Leu Glu Leu Gly Leu Ser Val Ala Cys Val
65 70 75 80 

Cys Thr Lys Phe Pro Glu Leu Ala Tyr Met Asn Glu Gly Arg Val Gln

85 90 95 

Phe Glu Val His Gln Pro Leu Ile Ala Arg Asp Gly Pro His Pro Val
100 105 110

Glu Gln Pro Val His Asn Tyr Met Thr Lys Val Ile Asp Arg Arg Ala

115 120 125 

Leu Asn Ala Ala Phe Ser Leu Ala Thr Glu Ala Ile Ala Leu Leu Thr
130 135 140 

Gly Glu Ala Leu Asp Gly Thr Gly Ile Ser Leu His Arg Gln Leu Arg
145 150 155 160 

Ala Ile Gln Gln Leu Ala Arg Asn Val Gln Ala Val Leu Gly Ala Phe

165 170 175 

Glu Arg Gly Thr Ala Asp Gln Met Leu His Val Leu Leu Glu Lys Ala
180 185 190

Pro Pro Leu Ala Leu Leu Leu Pro Met Gln Arg Tyr Leu Asp Asn Gly

195 200 205 

Arg Leu Ala Thr Arg Val Ala Arg Ala Thr Leu Val Ala Glu Leu Lys 
210 215 220 

Arg Ser Phe Cys Asp Thr Ser Phe Phe Leu Gly Lys Ala Gly His Arg
225 230 235 240 

Arg Glu Ala Ile Glu Ala Trp Leu Val Asp Leu Thr Thr Ala Thr Gln

245 250 255 

Pro Ser Val Ala Val Pro Arg Leu Thr His Ala Asp Thr Arg Gly Arg 
260 265 270

Pro Val Asp Gly Val Leu Val Thr Thr Ala Ala Ile Lys Gln Arg Leu

275 280 285 

Leu Gln Ser Phe Leu Lys Val Glu Asp Thr Glu Ala Asp Val Pro Val
290 295 300 

Thr Tyr Gly Glu Met Val Leu Asn Gly Ala Asn Leu Val Thr Ala Leu
305 310 315 320 

Val Met Gly Lys Ala Val Arg Ser Leu Asp Asp Val Gly Arg His Leu 
325 330 335 
Leu Glu Met Gln Glu Glu Gln Leu Glu Ala Asn Arg Glu Thr Leu Asp

340 345 350 

Glu Leu Glu Ser Ala Pro Gln Thr Thr Arg Val Arg Ala Asp Leu Val
355 360 365 

Ala Ile Gly Asp Arg Leu Val Phe Leu Glu Ala Leu Glu Lys Arg Ile
370 375 380 

Tyr Ala Ala Thr Asn Val Pro Tyr Pro Leu Val Gly Ala Met Asp Leu 
385 390 395 400 

Thr Phe Val Leu Pro Leu Gly Leu Phe Asn Pro Ala Met Glu Arg Phe 
405 410 415

Ala Ala His Ala Gly Asp Leu Val Pro Ala Pro Gly His Pro Glu Pro 

420 425 430 

Arg Ala Phe Pro Pro Arg Gln Leu Phe Phe Trp Gly Lys Asp His Gln
435 440 445 

Val Leu Arg Leu Ser Met Glu Asn Ala Val Gly Thr Val Cys His Pro
450 455 460 

Ser Leu Met Asn Ile Asp Ala Ala Val Gly Gly Val Asn His Asp Pro
465 470 475 480 

Val Glu Ala Ala Asn Pro Tyr Gly Ala Tyr Val Ala Ala Pro Ala Gly 
485 490 495

Pro Gly Ala Asp Met Gln Gln Arg Phe Leu Asn Ala Trp Arg Gln Arg

500 505 510 

Leu Ala His Gly Arg Val Arg Trp Val Ala Glu Cys Gln Met Thr Ala
515 520 525 

Glu Gln Phe Met Gln Pro Asp Asn Ala Asn Leu Ala Leu Glu Leu His
530 535 540 

Pro Ala Phe Asp Phe Phe Ala Gly Val Ala Asp Val Glu Leu Pro Gly 
545 550 555 560 

Gly Glu Val Pro Pro Ala Gly Pro Gly Ala Ile Gln Ala Thr Trp Arg
565 570 575

Val Val Asn Gly Asn Leu Pro Leu Ala Leu Cys Pro Val Ala Phe Arg 

580 585 590 

Asp Arg Leu Glu Leu Gly Val Gly Arg His Ala Met Ala Pro Ala Thr 
595 600 605 

Ile Ala Ala Val Arg Gly Ala Phe Glu Asp Arg Ser Tyr Pro Ala Val
610 615 620 

Phe Tyr Leu Leu Gln Ala Ala Ile His Gly Ser Glu His Val Phe Cys
625 630 635 640 

Ala Arg Leu Val Thr Gln Cys Ile Thr Ser Tyr Trp Asn Asn Thr Arg
645 650 655

Cys Ala Ala Phe Val Asn Asp Tyr Ser Leu Val Ser Tyr Ile Val Thr

660 665 670 

Tyr Leu Gly Gly Asp Leu Pro Glu Glu Cys Met Ala Val Tyr Arg Asp 
675 680 685

Leu Val Ala His Val Glu Ala Gln Leu Val Asp Asp Phe Thr Leu Pro

690 695 700

Gly Pro Glu Leu Gly Gly Gln Ala Gln Ala Glu Leu Asn His Leu Met
705 710 715 720

Arg Asp Pro Ala Leu Leu Pro Pro Leu Val Trp Asp Cys Asp Gly Leu 

725 730 735 

Met Arg His Ala Ala Leu Asp Arg His Arg Asp Cys Arg Ile Asp Ala
740 745 750 

Gly Gly His Glu Pro Val Tyr Ala Ala Ala Cys Asn Val Ala Thr Ala
755 760 765 

Asp Phe Asn Arg Asn Asp Gly Arg Leu Leu His Asn Thr Gln Ala Arg

770 775 780 

Ala Ala Asp Ala Ala Asp Asp Arg Pro His Arg Pro Ala Asp Trp Thr 
785 790 795 800

Val His His Lys Ile Tyr Tyr Tyr Val Leu Val Pro Ala Phe Ser Arg

805 810 815 

Gly Arg Cys Cys Thr Ala Gly Val Arg Phe Asp Arg Val Tyr Ala Thr
820 825 830 

Leu Gln Asn Met Val Val Pro Glu Ile Ala Pro Gly Glu Glu Cys Pro
835 840 845

Ser Asp Pro Val Thr Asp Pro Ala His Pro Leu His Pro Ala Asn Leu 

850 855 860 

Val Ala Asn Thr Val Asn Ala Met Phe His Asn Gly Arg Val Val Val 
865 870 875 880

Asp Gly Pro Ala Met Leu Thr Leu Gln Val Leu Ala His Asn Met Ala

885 890 895 

Glu Arg Thr Thr Ala Leu Leu Cys Ser Ala Ala Pro Asp Ala Gly Ala 
900 905 910 

Asn Thr Ala Ser Thr Ala Asn Met Arg Ile Phe Asp Gly Ala Leu His
915 920 925 

Ala Gly Val Leu Leu Met Ala Pro Gln His Leu Asp His Thr Ile Gln

930 935 940 

Asn Gly Glu Tyr Phe Tyr Val Leu Pro Val His Ala Leu Phe Ala Gly 
945 950 955 960

Ala Asp His Val Ala Asn Ala Pro Asn Phe Pro Pro Ala Leu Arg Asp 

965 970 975 

Leu Ala Arg His Val Pro Leu Val Pro Pro Ala Leu Gly Ala Asn Tyr 
980 985 990 

Phe Ser Ser Ile Arg Gln Pro Val Val Gln His Ala Arg Glu Ser Ala
995 1000 1005 

Ala Gly Glu Asn Ala Leu Thr Tyr Ala Leu Met Ala Gly Tyr Phe Lys 
1010 1015 1020 
Met Ser Pro Val Tyr His Gln Leu Lys Thr Gly Leu His Pro Gly Phe
1025 1030 1035 1040

Gly Phe Thr Val Val Arg Gln Asp Arg Phe Val Thr Glu Asn Val Leu
1045 1050 1055 

Phe Ser Ala Ser Glu Ala Tyr Phe Leu Gly Gln Leu Gln Val Ala Arg
1060 1065 1070 

His Glu Thr Gly Gly Gly Val Ser Phe Thr Leu Thr Gln Pro Arg Gly

1075 1080 1085 

Asn Val Asp Leu Gly Val Gly Tyr Thr Ala Val Ala Ala Thr Ala Thr 
1090 1095 1100

Val Arg Asn Pro Val Thr Asp Met Gly Asn Leu Pro Gln Asn Phe Tyr
1105 1110 1115 1120

Leu Gly Arg Gly Ala Pro Pro Leu Leu Asp Asn Ala Ala Ala Val Tyr 
1125 1130 1135 

Leu Arg Asn Ala Val Val Ala Gly Asn Arg Leu Gly Pro Ala Gln Pro
1140 1145 1150 

Leu Pro Val Phe Gly Cys Ala Gln Val Pro Arg Arg Ala Gly Met Asp

1155 1160 1165 

His Gly Gln Asp Ala Val Cys Glu Phe Ile Ala Thr Pro Val Ala Thr
1170 1175 1180

Asp Ile Asn Tyr Phe Arg Arg Pro Cys Asn Pro Arg Gly Arg Ala Ala
1185 1190 1195 1200

Gly Gly Val Tyr Ala Gly Asp Lys Glu Gly Asp Val Ile Ala Leu Met
1205 1210 1215 

Tyr Asp His Gly Gln Ser Asp Pro Ala Arg Pro Phe Ala Ala Thr Ala
1220 1225 1230 

Asn Pro Trp Ala Ser Gln Arg Phe Ser Tyr Gly Asp Leu Leu Tyr Asn

1235 1240 1245 

Gly Ala Tyr His Leu Asn Gly Asp Val Leu Ser Pro Cys Phe Lys Phe 
1250 1255 1260

Phe Thr Ala Ala Asp Ile Thr Ala Lys His Arg Cys Leu Glu Arg Leu
1265 1270 1275 1280

Ile Val Glu Thr Gly Ser Ala Val Ser Thr Ala Thr Ala Ala Ser Asp
1285 1290 1295 

Val Gln Phe Lys Arg Pro Pro Gly Cys Arg Glu Leu Val Glu Asp Pro
1300 1305 1310 

Cys Gly Leu Phe Gln Glu Ala Tyr Pro Ile Thr Cys Ala Ser Asp Pro

1315 1320 1325 

Ala Leu Leu Arg Ser Ala Arg Asp Gly Glu Ala His Ala Arg Glu Thr 
1330 1335 1340

His Phe Thr Gln Tyr Leu Ile Tyr Asp Asp Leu Lys Gly Leu Ser Leu
1345 1350 1355 1360
(2) INFORMATION FOR SEQ ID NO: 274:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 604 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274: 

Met Arg Pro Glu Leu Ser Leu Lys Gly Arg Pro Cys Val Thr Glu Ala 
1 5 10 15 

Val Val Cys Pro Ser Thr Asp Ala Ala Ile His Ser Gly Gly Ser Ser
20 25 30 

Ser Val Arg Pro Gln Pro Tyr Ala Arg Ala Ala Arg Ala Arg Ala Thr

35 40 45 

His Gly Ser Arg Ser Arg His Arg Gln Pro Leu Leu Pro Pro Pro Ser
50 55 60

Ser His His Pro Thr Ile Pro Pro Pro Pro Ser Pro Pro Arg Gly Ser
65 70 75 80 

Pro Ala Met Glu Leu Ser Tyr Ala Thr Thr Leu His His Arg Asp Val 
85 90 95 

Val Phe Tyr Val Thr Ala Asp Arg Asn Arg Ala Tyr Phe Val Cys Gly
100 105 110 

Gly Ser Val Tyr Ser Val Gly Arg Pro Arg Asp Ser Gln Pro Gly Glu

115 120 125 

Ile Ala Lys Phe Gly Leu Val Val Arg Gly Thr Gly Pro Lys Asp Arg
130 135 140

Met Val Ala Asn Tyr Val Arg Ser Glu Leu Arg Gln Arg Gly Leu Arg
145 150 155 160 

Asp Val Arg Pro Val Gly Glu Asp Glu Val Phe Leu Asp Ser Val Cys 
165 170 175 

Leu Leu Asn Pro Asn Val Ser Ser Asp Val Ile Asn Thr Asn Asp Val
180 185 190

Glu Val Leu Asp Glu Cys Leu Ala Glu Tyr Cys Thr Ser Leu Arg Thr 

195 200 205 

Ser Pro Gly Val Leu Val Thr Gly Val Arg Val Arg Ala Arg Asp Arg 
210 215 220

Val Ile Glu Leu Phe Glu His Pro Ala Ile Val Asn Ile Ser Ser Arg
225 230 235 240 

Phe Ala Tyr Thr Pro Ser Pro Tyr Val Phe Ala Gln Ala His Leu Pro
245 250 255

Arg Leu Pro Ser Ser Leu Glu Pro Leu Val Ser Gly Leu Phe Asp Gly 

260 265 270 

Ile Pro Ala Pro Arg Gln Pro Leu Asp Ala Arg Asp Arg Arg Thr Asp
275 280 285

Val Val Ile Thr Gly Thr Arg Ala Pro Arg Pro Met Ala Gly Thr Gly

290 295 300 

Ala Gly Gly Ala Gly Ala Lys Arg Ala Thr Val Ser Glu Phe Val Gln
305 310 315 320 

Val Lys His Ile Asp Arg Val Val Ser Pro Ser Val Ser Ser Ala Pro

325 330 335 

Pro Pro Ser Ala Pro Asp Ala Ser Leu Pro Pro Pro Gly Leu Gln Glu

340 345 350 

Ala Ala Pro Pro Gly Pro Pro Leu Arg Glu Leu Trp Trp Val Phe Tyr 
355 360 365

Ala Gly Asp Arg Ala Leu Glu Glu Pro His Ala Glu Ser Gly Leu Thr 

370 375 380 

Arg Glu Glu Val Arg Ala Val His Gly Phe Arg Glu Gln Ala Trp Lys
385 390 395 400 

Leu Phe Gly Ser Val Gly Ala Pro Arg Ala Phe Leu Gly Ala Ala Leu

405 410 415 

Ser Pro Thr Gln Lys Leu Ala Val Tyr Tyr Tyr Leu Ile His Arg Glu

420 425 430 

Arg Arg Met Ser Pro Phe Pro Ala Leu Val Arg Leu Val Gly Arg Tyr 
435 440 445

Ile Gln Arg His Gly Val Pro Ala Pro Asp Glu Pro Thr Leu Ala Asp

450 455 460

Ala Met Asn Gly Leu Phe Arg Asp Ala Ala Gly Thr Val Ala Glu Gln
465 470 475 480 

Leu Leu Met Phe Asp Leu Leu Pro Pro Lys Asp Val Pro Val Gly Ser

485 490 495 

Asp Ala Arg Ala Asp Ser Ala Ala Leu Leu Arg Phe Val Asp Ser Gln

500 505 510 

Arg Leu Thr Pro Gly Gly Ser Val Ser Pro Glu His Val Met Tyr Leu 
515 520 525

Gly Ala Phe Leu Gly Val Leu Tyr Ala Gly His Gly Arg Leu Ala Ala 

530 535 540 

Ala Thr His Thr Ala Arg Leu Thr Gly Val Thr Ser Leu Val Leu Thr 
545 550 555 560 

Val Gly Asp Val Asp Arg Met Ser Ala Phe Asp Arg Gly Pro Ala Gly

565 570 575 

Ala Ala Gly Arg Thr Arg Thr Ala Gly Tyr Leu Asp Ala Leu Leu Thr 
580 585 590 
Val Cys Leu Ala Arg Ala Gln His Gly Gln Ser Val
595 600 

(2) INFORMATION FOR SEQ ID NO: 275: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 522 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275: 


Met Glu Leu Ser Tyr Ala Thr Thr Leu His His Arg Asp Val Val Phe 

1 5 10 15

Tyr Val Thr Ala Asp Arg Asn Arg Ala Tyr Phe Val Cys Gly Gly Ser 
20 25 30 

Val Tyr Ser Val Gly Arg Pro Arg Asp Ser Gln Pro Gly Glu Ile Ala
35 40 45 

Lys Phe Gly Leu Val Val Arg Gly Thr Gly Pro Lys Asp Arg Met Val 

50 55 60 

Ala Asn Tyr Val Arg Ser Glu Leu Arg Gln Arg Gly Leu Arg Asp Val
65 70 75 80

Arg Pro Val Gly Glu Asp Glu Val Phe Leu Asp Ser Val Cys Leu Leu 

85 90 95 

Asn Pro Asn Val Ser Ser Asp Val Ile Asn Thr Asn Asp Val Glu Val
100 105 110 

Leu Asp Glu Cys Leu Ala Glu Tyr Cys Thr Ser Leu Arg Thr Ser Pro
115 120 125 

Gly Val Leu Val Thr Gly Val Arg Val Arg Ala Arg Asp Arg Val Ile

130 135 140 

Glu Leu Phe Glu His Pro Ala Ile Val Asn Ile Ser Ser Arg Phe Ala
145 150 155 160

Tyr Thr Pro Ser Pro Tyr Val Phe Ala Gln Ala His Leu Pro Arg Leu

165 170 175 

Pro Ser Ser Leu Glu Pro Leu Val Ser Gly Leu Phe Asp Gly Ile Pro
180 185 190 

Ala Pro Arg Gln Pro Leu Asp Ala Arg Asp Arg Arg Thr Asp Val Val
195 200 205 

Ile Thr Gly Thr Arg Ala Pro Arg Pro Met Ala Gly Thr Gly Ala Gly
210 215 220 
Gly Ala Gly Ala Lys Arg Ala Thr Val Ser Glu Phe Val Gln Val Lys
225 230 235 240 

His Ile Asp Arg Val Val Ser Pro Ser Val Ser Ser Ala Pro Pro Pro
245 250 255 

Ser Ala Pro Asp Ala Ser Leu Pro Pro Pro Gly Leu Gln Glu Ala Ala
260 265 270 

Pro Pro Gly Pro Pro Leu Arg Glu Leu Trp Trp Val Phe Tyr Ala Gly 

275 280 285 

Asp Arg Ala Leu Glu Glu Pro His Ala Glu Ser Gly Leu Thr Arg Glu 
290 295 300

Glu Val Arg Ala Val His Gly Phe Arg Glu Gln Ala Trp Lys Leu Phe
305 310 315 320 

Gly Ser Val Gly Ala Pro Arg Ala Phe Leu Gly Ala Ala Leu Ser Pro 
325 330 335 

Thr Gln Lys Leu Ala Val Tyr Tyr Tyr Leu Ile His Arg Glu Arg Arg
340 345 350 

Met Ser Pro Phe Pro Ala Leu Val Arg Leu Val Gly Arg Tyr Ile Gln

355 360 365 

Arg His Gly Val Pro Ala Pro Asp Glu Pro Thr Leu Ala Asp Ala Met 
370 375 380

Asn Gly Leu Phe Arg Asp Ala Ala Gly Thr Val Ala Glu Gln Leu Leu
385 390 395 400 

Met Phe Asp Leu Leu Pro Pro Lys Asp Val Pro Val Gly Ser Asp Ala 
405 410 415 

Arg Ala Asp Ser Ala Ala Leu Leu Arg Phe Val Asp Ser Gln Arg Leu
420 425 430 

Thr Pro Gly Gly Ser Val Ser Pro Glu His Val Met Tyr Leu Gly Ala 

435 440 445 

Phe Leu Gly Val Leu Tyr Ala Gly His Gly Arg Leu Ala Ala Ala Thr 
450 455 460

His Thr Ala Arg Leu Thr Gly Val Thr Ser Leu Val Leu Thr Val Gly 
465 470 475 480 

Asp Val Asp Arg Met Ser Ala Phe Asp Arg Gly Pro Ala Gly Ala Ala 
485 490 495 

Gly Arg Thr Arg Thr Ala Gly Tyr Leu Asp Ala Leu Leu Thr Val Cys
500 505 510 

Leu Ala Arg Ala Gln His Gly Gln Ser Val
515 520

(2) INFORMATION FOR SEQ ID NO: 276:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 602 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:

Met Thr Ala Ala Ala Leu Tyr Gly Gly Ala Lys Tyr Arg Pro Gly Thr 
1 5 10 15

Leu Arg Asn Pro Gly Arg Val Ala Ser Thr Pro Arg Arg Arg Gly Val 

20 25 30 

Leu Tyr Gly Ala Leu Cys Pro Gly Ile Pro Phe Val Gly Ser Gly Pro
35 40 45 

Gly Ala Val Gly Trp Glu Cys Val Cys Val Gly Gly Gly Arg Arg Asp
50 55 60 

Gly Gly Pro Asp Gln Val Tyr Arg Gly Arg Ser Val Gly Arg Pro Asn
65 70 75 80 

Arg Pro Phe Lys His Leu Arg Met His Arg Pro Ser Gln Ser Asp Thr
85 90 95

Gly Thr His Gln Arg Arg Lys Pro Pro Ser Pro Val Arg Val Arg Val

100 105 110 

Phe Ser Gly Gly Val Phe Phe Leu Ser Ala Leu Leu Pro Pro His Leu 
115 120 125 

His His Pro Pro Pro Thr Trp Leu Ala Ile Gly Gly Lys Thr Met Lys
130 135 140 

Thr Lys Pro Leu Pro Thr Ala Pro Met Ala Trp Ala Glu Ser Ala Val 
145 150 155 160 

Glu Thr Thr Thr Ser Pro Arg Glu Leu Ala Gly His Ala Pro Leu Arg 
165 170 175

Arg Val Leu Arg Pro Pro Ile Ala Arg Arg Asp Gly Pro Val Leu Leu

180 185 190 

Gly Asp Arg Ala Pro Arg Arg Thr Ala Ser Thr Met Trp Leu Leu Gly 
195 200 205 

Ile Asp Pro Ala Glu Ser Ser Pro Gly Thr Arg Ala Thr Arg Asp Asp
210 215 220 

Thr Glu Gln Ala Val Asp Lys Ile Leu Arg Gly Ala Arg Arg Ala Gly
225 230 235 240 

Gly Leu Thr Val Pro Gly Ala Pro Arg Tyr His Leu Thr Arg Gln Val
245 250 255

Thr Leu Thr Asp Leu Cys Gln Pro Asn Ala Glu Arg Ala Gly Ala Leu

260 265 270 

Leu Leu Ala Leu Arg His Pro Thr Asp Leu Pro His Leu Ala Arg His 
275 280 285

Arg Ala Pro Pro Gly Arg Gln Thr Glu Arg Leu Ala Glu Ala Trp Gly

290 295 300

Gln Leu Leu Glu Ala Ser Ala Leu Gly Ser Gly Arg Ala Glu Ser Gly
305 310 315 320

Cys Ala Arg Ala Gly Leu Val Ser Phe Asn Phe Leu Val Ala Ala Cys 

325 330 335 

Ala Ala Ala Tyr Asp Ala Arg Asp Ala Ala Glu Ala Val Arg Ala His 
340 345 350 

Ile Thr Thr Asn Tyr Gly Gly Thr Arg Ala Gly Ala Arg Leu Asp Arg
355 360 365 

Phe Ser Glu Cys Leu Arg Ala Met Val His Thr His Val Phe Phe Val 

370 375 380 

Met Arg Phe Phe Gly Gly Leu Val Ser Trp Val Thr Gln Asp Glu Leu
385 390 395 400

Ala Ser Val Thr Ala Val Cys Ser Gly Pro Gln Glu Ala Thr His Thr

405 410 415 

Gly His Pro Gly Arg Pro Cys Ser Ala Val Thr Ile Pro Ala Cys Ala
420 425 430 

Phe Val Asp Leu Asp Ala Glu Leu Cys Leu Gly Gly Pro Gly Ala Ala
435 440 445 

Phe Leu Tyr Leu Val Phe Tyr Gln Cys Arg Asp Gln Glu Leu Cys Cys

450 455 460 

Val Tyr Val Val Lys Ser Gln Leu Pro Pro Arg Gly Leu Glu Ala Ala
465 470 475 480

Leu Glu Arg Leu Phe Gly Arg Leu Arg Ile Thr Asn Thr Ile His Gly

485 490 495 

Ala Glu Asp Met Thr Pro Pro Pro Pro Asn Arg Asn Val Asp Phe Pro 
500 505 510 

Leu Ala Val Leu Ala Ala Ser Ser Gln Ser Pro Arg Cys Ser Ala Ser
515 520 525 

Gln Val Thr Asn Pro Gln Phe Val Asp Arg Leu Tyr Arg Trp Gln Pro

530 535 540 

Asp Leu Arg Gly Arg Pro Thr Ala Arg Thr Cys Thr Tyr Ala Ala Phe 
545 550 555 560

Ala Glu Leu Gly Val Met Pro Asp Asn Ser Pro Arg Cys Leu His Arg 

565 570 575 

Thr Glu Arg Phe Gly Ala Val Gly Val Pro Val Val Ile Gly Val Val
580 585 590 

Trp Arg Pro Gly Gly Trp Arg Ala Cys Ala
595 600 



(2) INFORMATION FOR SEQ ID NO: 277: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 515 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:

Met His Arg Pro Ser Gln Ser Asp Thr Gly Thr His Gln Arg Arg Lys

1 5 10 15

Pro Pro Ser Pro Val Arg Val Arg Val Phe Ser Gly Gly Val Phe Phe 
20 25 30

Leu Ser Ala Leu Leu Pro Pro His Leu His His Pro Pro Pro Thr Trp 

35 40 45

Leu Ala Ile Gly Gly Lys Thr Met Lys Thr Lys Pro Leu Pro Thr Ala
50 55 60 

Pro Met Ala Trp Ala Glu Ser Ala Val Glu Thr Thr Thr Ser Pro Arg
65 70 75 80 

Glu Leu Ala Gly His Ala Pro Leu Arg Arg Val Leu Arg Pro Pro Ile

85 90 95 

Ala Arg Arg Asp Gly Pro Val Leu Leu Gly Asp Arg Ala Pro Arg Arg 
100 105 110

Thr Ala Ser Thr Met Trp Leu Leu Gly Ile Asp Pro Ala Glu Ser Ser

115 120 125 

Pro Gly Thr Arg Ala Thr Arg Asp Asp Thr Glu Gln Ala Val Asp Lys
130 135 140 

Ile Leu Arg Gly Ala Arg Arg Ala Gly Gly Leu Thr Val Pro Gly Ala
145 150 155 160 

Pro Arg Tyr His Leu Thr Arg Gln Val Thr Leu Thr Asp Leu Cys Gln

165 170 175 

Pro Asn Ala Glu Arg Ala Gly Ala Leu Leu Leu Ala Leu Arg His Pro 
180 185 190

Thr Asp Leu Pro His Leu Ala Arg His Arg Ala Pro Pro Gly Arg Gln

195 200 205 

Thr Glu Arg Leu Ala Glu Ala Trp Gly Gln Leu Leu Glu Ala Ser Ala
210 215 220 

Leu Gly Ser Gly Arg Ala Glu Ser Gly Cys Ala Arg Ala Gly Leu Val
225 230 235 240 

Ser Phe Asn Phe Leu Val Ala Ala Cys Ala Ala Ala Tyr Asp Ala Arg
245 250 255

Asp Ala Ala Glu Ala Val Arg Ala His Ile Thr Thr Asn Tyr Gly Gly
260 265 270

Thr Arg Ala Gly Ala Arg Leu Asp Arg Phe Ser Glu Cys Leu Arg Ala
275 280 285

Met Val His Thr His Val Phe Phe Val Met Arg Phe Phe Gly Gly Leu
290 295 300

Val Ser Trp Val Thr Gln Asp Glu Leu Ala Ser Val Thr Ala Val Cys
305 310 315 320

Ser Gly Pro Gln Glu Ala Thr His Thr Gly His Pro Gly Arg Pro Cys
325 330 335

Ser Ala Val Thr Ile Pro Ala Cys Ala Phe Val Asp Leu Asp Ala Glu
340 345 350

Leu Cys Leu Gly Gly Pro Gly Ala Ala Phe Leu Tyr Leu Val Phe Tyr
355 360 365

Gln Cys Arg Asp Gln Glu Leu Cys Cys Val Tyr Val Val Lys Ser Gln
370 375 380

Leu Pro Pro Arg Gly Leu Glu Ala Ala Leu Glu Arg Leu Phe Gly Arg
385 390 395 400

Leu Arg Ile Thr Asn Thr Ile His Gly Ala Glu Asp Met Thr Pro Pro
405 410 415

Pro Pro Asn Arg Asn Val Asp Phe Pro Leu Ala Val Leu Ala Ala Ser
420 425 430

Ser Gln Ser Pro Arg Cys Ser Ala Ser Gln Val Thr Asn Pro Gln Phe
435 440 445

Val Asp Arg Leu Tyr Arg Trp Gln Pro Asp Leu Arg Gly Arg Pro Thr
450 455 460

Ala Arg Thr Cys Thr Tyr Ala Ala Phe Ala Glu Leu Gly Val Met Pro
465 470 475 480

Asp Asn Ser Pro Arg Cys Leu His Arg Thr Glu Arg Phe Gly Ala Val
485 490 495

Gly Val Pro Val Val Ile Gly Val Val Trp Arg Pro Gly Gly Trp Arg
500 505 510

Ala Cys Ala
515






(2) INFORMATION FOR SEQ ID NO: 278: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 460 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:



Met Lys Thr Lys Pro Leu Pro Thr Ala Pro Met Ala Trp Ala Glu Ser
1 5 10 15

Ala Val Glu Thr Thr Thr Ser Pro Arg Glu Leu Ala Gly His Ala Pro
20 25 30

Leu Arg Arg Val Leu Arg Pro Pro Ile Ala Arg Arg Asp Gly Pro Val
35 40 45

Leu Leu Gly Asp Arg Ala Pro Arg Arg Thr Ala Ser Thr Met Trp Leu
50 55 60

Leu Gly Ile Asp Pro Ala Glu Ser Ser Pro Gly Thr Arg Ala Thr Arg
65 70 75 80

Asp Asp Thr Glu Gln Ala Val Asp Lys Ile Leu Arg Gly Ala Arg Arg
85 90 95

Ala Gly Gly Leu Thr Val Pro Gly Ala Pro Arg Tyr His Leu Thr Arg
100 105 110

Gln Val Thr Leu Thr Asp Leu Cys Gln Pro Asn Ala Glu Arg Ala Gly
115 120 125

Ala Leu Leu Leu Ala Leu Arg His Pro Thr Asp Leu Pro His Leu Ala
130 135 140

Arg His Arg Ala Pro Pro Gly Arg Gln Thr Glu Arg Leu Ala Glu Ala
145 150 155 160

Trp Gly Gln Leu Leu Glu Ala Ser Ala Leu Gly Ser Gly Arg Ala Glu
165 170 175

Ser Gly Cys Ala Arg Ala Gly Leu Val Ser Phe Asn Phe Leu Val Ala
180 185 190

Ala Cys Ala Ala Ala Tyr Asp Ala Arg Asp Ala Ala Glu Ala Val Arg
195 200 205

Ala His Ile Thr Thr Asn Tyr Gly Gly Thr Arg Ala Gly Ala Arg Leu
210 215 220

Asp Arg Phe Ser Glu Cys Leu Arg Ala Met Val His Thr His Val Phe
225 230 235 240

Phe Val Met Arg Phe Phe Gly Gly Leu Val Ser Trp Val Thr Gln Asp
245 250 255

Glu Leu Ala Ser Val Thr Ala Val Cys Ser Gly Pro Gln Glu Ala Thr
260 265 270

His Thr Gly His Pro Gly Arg Pro Cys Ser Ala Val Thr Ile Pro Ala
275 280 285

Cys Ala Phe Val Asp Leu Asp Ala Glu Leu Cys Leu Gly Gly Pro Gly
290 295 300

Ala Ala Phe Leu Tyr Leu Val Phe Tyr Gln Cys Arg Asp Gln Glu Leu
305 310 315 320

Cys Cys Val Tyr Val Val Lys Ser Gln Leu Pro Pro Arg Gly Leu Glu
325 330 335

Ala Ala Leu Glu Arg Leu Phe Gly Arg Leu Arg Ile Thr Asn Thr Ile
340 345 350

His Gly Ala Glu Asp Met Thr Pro Pro Pro Pro Asn Arg Asn Val Asp
355 360 365

Phe Pro Leu Ala Val Leu Ala Ala Ser Ser Gln Ser Pro Arg Cys Ser
370 375 380

Ala Ser Gln Val Thr Asn Pro Gln Phe Val Asp Arg Leu Tyr Arg Trp
385 390 395 400

Gln Pro Asp Leu Arg Gly Arg Pro Thr Ala Arg Thr Cys Thr Tyr Ala
405 410 415

Ala Phe Ala Glu Leu Gly Val Met Pro Asp Asn Ser Pro Arg Cys Leu
420 425 430

His Arg Thr Glu Arg Phe Gly Ala Val Gly Val Pro Val Val Ile Gly
435 440 445

Val Val Trp Arg Pro Gly Gly Trp Arg Ala Cys Ala
450 455 460






(2) INFORMATION FOR SEQ ID NO: 279: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 452 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:



Met Ala Val Val Cys Gly Ser Gly Leu Arg Leu Arg Pro Phe His Pro
1 5 10 15

Pro Ser Pro Ser Phe Phe Val Leu Arg Ala Leu Ile Arg Ala Gly Pro
20 25 30

Gly Pro Phe Ala Asp Arg Ala Pro Ser Gly Pro Gly Cys Gly Met Cys
35 40 45

Arg Gly Asp Ser Pro Gly Val Ala Gly Gly Ser Gly Glu His Cys Leu
50 55 60

Gly Gly Asp Asp Gly Asp Asp Gly Arg Pro Arg Leu Ala Cys Val Gly
65 70 75 80

Ala Ile Arg Phe Ala His Leu Trp Leu Gln Ala Thr Thr Leu Gly Phe
85 90 95

Val Gly Ser Val Val Leu Ser Arg Gly Pro Tyr Ala Asp Ala Met Ser
100 105 110

Gly Ala Phe Val Ile Gly Ser Thr Gly Leu Gly Phe Leu Arg Ala Pro
115 120 125

Pro Ala Phe Ala Arg Pro Pro Thr Arg Val Cys Ala Trp Leu Arg Leu
130 135 140

Val Gly Gly Gly Ala Ala Val Trp Ser Leu Gly Glu Ala Gly Ala Pro
145 150 155 160

Pro Gly Val Pro Gly Pro Ala Thr Gln Cys Leu Ala Leu Gly Ala Ala
165 170 175

Tyr Ala Ala Leu Leu Val Leu Ala Asp Asp Val His Pro Leu Phe Leu
180 185 190

Leu Ala Pro Arg Pro Leu Phe Val Gly Thr Leu Gly Val Val Val Gly
195 200 205

Gly Leu Thr Ile Gly Gly Ser Ala Arg Tyr Trp Trp Ile Asp Pro Arg
210 215 220

Ala Ala Ala Ala Leu Thr Ala Ala Val Val Ala Gly Leu Gly Thr Thr
225 230 235 240

Ala Ala Gly Asp Ser Phe Ser Lys Ala Cys Pro Arg His Arg Arg Phe
245 250 255

Cys Val Val Ser Ala Val Glu Ser Pro Pro Pro Arg Tyr Ala Pro Glu
260 265 270

Asp Ala Glu Arg Pro Thr Asp His Gly Pro Leu Leu Pro Ser Thr His
275 280 285

His Gln Arg Ser Pro Arg Val Cys Gly Asp Gly Ala Ala Arg Pro Glu
290 295 300

Asn Ile Trp Val Pro Val Val Thr Phe Ala Gly Ala Leu Ala Ala Cys
305 310 315 320

Ala Arg Ser Asp Ala Ala Pro Ser Gly Pro Val Leu Pro Leu Trp Pro
325 330 335

Gln Val Phe Val Gly Gly His Ala Ala Ala Gly Leu Thr Glu Leu Cys
340 345 350

Gln Thr Leu Ala Pro Arg Asp Leu Thr Asp Pro Leu Leu Phe Ala Tyr
355 360 365

Val Gly Phe Gln Val Val Asn His Gly Leu Met Phe Val Val Pro Asp
370 375 380

Ile Ala Val Tyr Ala Met Leu Gly Gly Ala Val Trp Ile Ser Leu Thr
385 390 395 400

Gln Val Leu Gly Leu Arg Arg Arg Leu His Lys Asp Pro Asp Ala Gly
405 410 415

Pro Trp Ala Ala Ala Thr Leu Arg Gly Leu Phe Phe Ser Val Tyr Ala
420 425 430

Leu Gly Phe Ala Ala Gly Val Leu Val Arg Pro Arg Met Ala Ala Ser
435 440 445

Arg Arg Ser Gly
450

(2) INFORMATION FOR SEQ ID NO: 280:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 406 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280:



Met Cys Arg Gly Asp Ser Pro Gly Val Ala Gly Gly Ser Gly Glu His
1 5 10 15

Cys Leu Gly Gly Asp Asp Gly Asp Asp Gly Arg Pro Arg Leu Ala Cys
20 25 30

Val Gly Ala Ile Arg Phe Ala His Leu Trp Leu Gln Ala Thr Thr Leu
35 40 45

Gly Phe Val Gly Ser Val Val Leu Ser Arg Gly Pro Tyr Ala Asp Ala
50 55 60

Met Ser Gly Ala Phe Val Ile Gly Ser Thr Gly Leu Gly Phe Leu Arg
65 70 75 80

Ala Pro Pro Ala Phe Ala Arg Pro Pro Thr Arg Val Cys Ala Trp Leu
85 90 95

Arg Leu Val Gly Gly Gly Ala Ala Val Trp Ser Leu Gly Glu Ala Gly
100 105 110

Ala Pro Pro Gly Val Pro Gly Pro Ala Thr Gln Cys Leu Ala Leu Gly
115 120 125

Ala Ala Tyr Ala Ala Leu Leu Val Leu Ala Asp Asp Val His Pro Leu
130 135 140

Phe Leu Leu Ala Pro Arg Pro Leu Phe Val Gly Thr Leu Gly Val Val
145 150 155 160

Val Gly Gly Leu Thr Ile Gly Gly Ser Ala Arg Tyr Trp Trp Ile Asp
165 170 175

Pro Arg Ala Ala Ala Ala Leu Thr Ala Ala Val Val Ala Gly Leu Gly
180 185 190

Thr Thr Ala Ala Gly Asp Ser Phe Ser Lys Ala Cys Pro Arg His Arg
195 200 205

Arg Phe Cys Val Val Ser Ala Val Glu Ser Pro Pro Pro Arg Tyr Ala
210 215 220

Pro Glu Asp Ala Glu Arg Pro Thr Asp His Gly Pro Leu Leu Pro Ser
225 230 235 240

Thr His His Gln Arg Ser Pro Arg Val Cys Gly Asp Gly Ala Ala Arg
245 250 255

Pro Glu Asn Ile Trp Val Pro Val Val Thr Phe Ala Gly Ala Leu Ala
260 265 270

Ala Cys Ala Arg Ser Asp Ala Ala Pro Ser Gly Pro Val Leu Pro Leu
275 280 285

Trp Pro Gln Val Phe Val Gly Gly His Ala Ala Ala Gly Leu Thr Glu
290 295 300

Leu Cys Gln Thr Leu Ala Pro Arg Asp Leu Thr Asp Pro Leu Leu Phe
305 310 315 320

Ala Tyr Val Gly Phe Gln Val Val Asn His Gly Leu Met Phe Val Val
325 330 335

Pro Asp Ile Ala Val Tyr Ala Met Leu Gly Gly Ala Val Trp Ile Ser
340 345 350

Leu Thr Gln Val Leu Gly Leu Arg Arg Arg Leu His Lys Asp Pro Asp
355 360 365

Ala Gly Pro Trp Ala Ala Ala Thr Leu Arg Gly Leu Phe Phe Ser Val
370 375 380

Tyr Ala Leu Gly Phe Ala Ala Gly Val Leu Val Arg Pro Arg Met Ala
385 390 395 400

Ala Ser Arg Arg Ser Gly
405

(2) INFORMATION FOR SEQ ID NO: 281: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 644 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:

Met Gly Thr Glu Asp Cys Asp His Glu Gly Arg Ser Val Ala Ala Pro
1 5 10 15

Val Glu Val Met Ala Leu Tyr Ala Thr Asp Gly Cys Val Ile Thr Ser
20 25 30

Ser Leu Ala Leu Leu Thr Asn Cys Leu Leu Gly Ala Glu Pro Leu Tyr
35 40 45

Ile Phe Ser Tyr Asp Ala Tyr Arg Pro Asp Ala Pro Asn Gly Pro Thr
50 55 60

Gly Ala Pro Thr Glu Gln Glu Arg Phe Glu Gly Ser Arg Ala Leu Tyr
65 70 75 80

Arg Asp Ala Gly Gln Gly Asp Ser Phe Arg Val Thr Phe Cys Leu Leu
85 90 95

Gly Thr Glu Val Gly Val Thr His His Pro Lys Gly Arg Trp Met Phe
100 105 110

Val Cys Arg Phe Glu Arg Ala Asp Asp Val Ala Val Leu Gln Asp Ala
115 120 125

Leu Gly Arg Gly Thr Pro Leu Leu Pro Ala His Ile Thr Ala Thr Leu
130 135 140

Asp Leu Glu Ala Thr Phe Ala Leu His Ala Asn Ile Ile Met Ala Leu
145 150 155 160

Thr Val Ala Ile Val His Asn Ala Pro Ala Arg Ile Gly Ser Gly Ser
165 170 175

Thr Ala Pro Leu Tyr Glu Pro Gly Glu Ser Met Arg Ser Val Val Gly
180 185 190

Arg Met Ser Leu Gly Gln Arg Gly Leu Thr Thr Leu Phe Val His His
195 200 205

Glu Ala Arg Val Leu Ala Ala Tyr Arg Arg Ala Tyr Tyr Gly Ser Ala
210 215 220

Gln Ser Pro Phe Trp Phe Leu Ser Lys Phe Gly Pro Asp Glu Lys Ser
225 230 235 240

Leu Val Leu Ala Ala Arg Tyr Tyr Val Leu Gln Ala Pro Arg Leu Gly
245 250 255

Gly Ala Gly Ala Thr Tyr Asp Leu Gln Ala Val Lys Asp Ile Cys Ala
260 265 270

Thr Tyr Ala Ile Pro His Asp Pro Arg Pro Asp Thr Leu Ser Ala Ala
275 280 285

Ser Leu Thr Ser Phe Ala Ala Ile Thr Arg Phe Cys Cys Thr Ser Gln
290 295 300

Tyr Ser Arg Gly Ala Ala Ala Ala Gly Phe Pro Leu Tyr Val Glu Arg
305 310 315 320

Arg Ile Ala Ala Asp Val Arg Glu Thr Gly Ala Leu Glu Lys Phe Ile
325 330 335

Ala His Asp Arg Ser Cys Leu Arg Val Ser Asp Arg Glu Phe Ile Thr
340 345 350

Tyr Ile Tyr Leu Ala His Phe Glu Cys Phe Ser Pro Pro Arg Leu Ala
355 360 365

Thr His Leu Arg Ala Val Thr Thr His Asp Pro Ser Pro Ala Ala Ser
370 375 380

Thr Glu Gln Pro Ser Pro Leu Gly Arg Glu Ala Val Glu Gln Phe Phe
385 390 395 400

Arg His Val Arg Ala Gln Leu Asn Ile Arg Glu Tyr Val Lys Gln Asn
405 410 415

Val Thr Pro Arg Glu Thr Ala Gly Asp Ala Ala Ala Ala Tyr Leu Arg
420 425 430

Ala Arg Thr Tyr Ala Pro Ala Ala Leu Thr Pro Ala Pro Ala Tyr Cys
435 440 445

Gly Val Ala Asp Ser Ser Thr Lys Met Met Gly Arg Leu Ala Glu Ala
450 455 460

Glu Arg Leu Leu Val Pro His Gly Trp Pro Ala Phe Ala Pro Thr Thr
465 470 475 480

Pro Gly Asp Asp Ala Gly Gly Gly Thr Ala Ala Pro Gln Thr Cys Gly
485 490 495

Ile Val Lys Arg Leu Leu Lys Leu Ala Ala Thr Glu Gln Gln Gly Thr
500 505 510

Thr Pro Pro Ala Ile Ala Ala Leu Met Gln Asp Ala Ser Val Gln Thr
515 520 525

Pro Leu Pro Val Tyr Arg Ile Thr Met Ser Pro Thr Gly Gln Ala Phe
530 535 540

Ala Ala Ala Ala Arg Asp Asp Trp Ala Arg Val Thr Arg Asp Ala Arg
545 550 555 560

Pro Pro Glu Ala Thr Val Val Ala Asp Ala Ala Ala Ala Pro Glu Pro
565 570 575

Gly Ala Leu Gly Arg Arg Leu Thr Arg Arg Ile Cys Arg Pro Ala Pro
580 585 590

Pro Pro Gly Arg Pro Gly Arg Arg Gly Pro Asp Val Arg Glu Pro Gln
595 600 605

Arg Asp Leu Gln Arg Arg Ala Gly Arg Tyr Glu His His Pro Gly Ser
610 615 620

Gly His Arg Pro Glu Gly Ala Arg Pro Leu Ser Pro Ala Pro Arg Gly
625 630 635 640

Pro Gly Ser Leu

(2) INFORMATION FOR SEQ ID NO: 282:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 715 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:

Met Gly Ala Gly Lys Ser Ala Leu Thr Thr Ala Arg Ala Ser Cys Ser
1 5 10 15

Arg Gly Ser Xaa Ser Glu Gly Gly Ala Ala Ala Arg Ile Ile Ser Tyr
20 25 30

Cys Cys Ser Ser Gly Arg Val Pro Gln Pro His Ser Thr Pro Ser Arg
35 40 45

Asp Ala Ile Pro Glu His Arg Ser Ala Pro Ala Phe Pro His Pro Thr
50 55 60

Pro Ser Gly Phe Ala Gly Ala Met Gly Thr Glu Asp Cys Asp His Glu
65 70 75 80

Gly Arg Ser Val Ala Ala Pro Val Glu Val Met Ala Leu Tyr Ala Thr
85 90 95

Asp Gly Cys Val Ile Thr Ser Ser Leu Ala Leu Leu Thr Asn Cys Leu
100 105 110

Leu Gly Ala Glu Pro Leu Tyr Ile Phe Ser Tyr Asp Ala Tyr Arg Pro
115 120 125

Asp Ala Pro Asn Gly Pro Thr Gly Ala Pro Thr Glu Gln Glu Arg Phe
130 135 140

Glu Gly Ser Arg Ala Leu Tyr Arg Asp Ala Gly Gln Gly Asp Ser Phe
145 150 155 160

Arg Val Thr Phe Cys Leu Leu Gly Thr Glu Val Gly Val Thr His His
165 170 175

Pro Lys Gly Arg Trp Met Phe Val Cys Arg Phe Glu Arg Ala Asp Asp
180 185 190

Val Ala Val Leu Gln Asp Ala Leu Gly Arg Gly Thr Pro Leu Leu Pro
195 200 205

Ala His Ile Thr Ala Thr Leu Asp Leu Glu Ala Thr Phe Ala Leu His
210 215 220

Ala Asn Ile Ile Met Ala Leu Thr Val Ala Ile Val His Asn Ala Pro
225 230 235 240

Ala Arg Ile Gly Ser Gly Ser Thr Ala Pro Leu Tyr Glu Pro Gly Glu
245 250 255

Ser Met Arg Ser Val Val Gly Arg Met Ser Leu Gly Gln Arg Gly Leu
260 265 270

Thr Thr Leu Phe Val His His Glu Ala Arg Val Leu Ala Ala Tyr Arg
275 280 285

Arg Ala Tyr Tyr Gly Ser Ala Gln Ser Pro Phe Trp Phe Leu Ser Lys
290 295 300

Phe Gly Pro Asp Glu Lys Ser Leu Val Leu Ala Ala Arg Tyr Tyr Val
305 310 315 320

Leu Gln Ala Pro Arg Leu Gly Gly Ala Gly Ala Thr Tyr Asp Leu Gln
325 330 335

Ala Val Lys Asp Ile Cys Ala Thr Tyr Ala Ile Pro His Asp Pro Arg
340 345 350

Pro Asp Thr Leu Ser Ala Ala Ser Leu Thr Ser Phe Ala Ala Ile Thr
355 360 365

Arg Phe Cys Cys Thr Ser Gln Tyr Ser Arg Gly Ala Ala Ala Ala Gly
370 375 380

Phe Pro Leu Tyr Val Glu Arg Arg Ile Ala Ala Asp Val Arg Glu Thr
385 390 395 400

Gly Ala Leu Glu Lys Phe Ile Ala His Asp Arg Ser Cys Leu Arg Val
405 410 415

Ser Asp Arg Glu Phe Ile Thr Tyr Ile Tyr Leu Ala His Phe Glu Cys
420 425 430

Phe Ser Pro Pro Arg Leu Ala Thr His Leu Arg Ala Val Thr Thr His
435 440 445

Asp Pro Ser Pro Ala Ala Ser Thr Glu Gln Pro Ser Pro Leu Gly Arg
450 455 460

Glu Ala Val Glu Gln Phe Phe Arg His Val Arg Ala Gln Leu Asn Ile
465 470 475 480

Arg Glu Tyr Val Lys Gln Asn Val Thr Pro Arg Glu Thr Ala Gly Asp
485 490 495

Ala Ala Ala Ala Tyr Leu Arg Ala Arg Thr Tyr Ala Pro Ala Ala Leu
500 505 510

Thr Pro Ala Pro Ala Tyr Cys Gly Val Ala Asp Ser Ser Thr Lys Met
515 520 525

Met Gly Arg Leu Ala Glu Ala Glu Arg Leu Leu Val Pro His Gly Trp
530 535 540

Pro Ala Phe Ala Pro Thr Thr Pro Gly Asp Asp Ala Gly Gly Gly Thr
545 550 555 560

Ala Ala Pro Gln Thr Cys Gly Ile Val Lys Arg Leu Leu Lys Leu Ala
565 570 575

Ala Thr Glu Gln Gln Gly Thr Thr Pro Pro Ala Ile Ala Ala Leu Met
580 585 590

Gln Asp Ala Ser Val Gln Thr Pro Leu Pro Val Tyr Arg Ile Thr Met
595 600 605

Ser Pro Thr Gly Gln Ala Phe Ala Ala Ala Ala Arg Asp Asp Trp Ala
610 615 620

Arg Val Thr Arg Asp Ala Arg Pro Pro Glu Ala Thr Val Val Ala Asp
625 630 635 640

Ala Ala Ala Ala Pro Glu Pro Gly Ala Leu Gly Arg Arg Leu Thr Arg
645 650 655

Arg Ile Cys Arg Pro Ala Pro Pro Pro Gly Arg Pro Gly Arg Arg Gly
660 665 670

Pro Asp Val Arg Glu Pro Gln Arg Asp Leu Gln Arg Arg Ala Gly Arg
675 680 685

Tyr Glu His His Pro Gly Ser Gly His Arg Pro Glu Gly Ala Arg Pro
690 695 700

Leu Ser Pro Ala Pro Arg Gly Pro Gly Ser Leu
705 710 715

(2) INFORMATION FOR SEQ ID NO: 283: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 744 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 283:



Met Gln Ala Trp Tyr Val Arg Ala Arg Ala Arg Ala Phe Thr Arg Arg
1 5 10 15

Arg Val Ser Ser Ser Asp Ser Arg Ala Ser Ser Ser Val Met Gly Ala
20 25 30

Gly Lys Ser Ala Leu Thr Thr Ala Arg Ala Ser Cys Ser Arg Gly Ser
35 40 45

Xaa Ser Glu Gly Gly Ala Ala Ala Arg Ile Ile Ser Tyr Cys Cys Ser
50 55 60

Ser Gly Arg Val Pro Gln Pro His Ser Thr Pro Ser Arg Asp Ala Ile
65 70 75 80

Pro Glu His Arg Ser Ala Pro Ala Phe Pro His Pro Thr Pro Ser Gly
85 90 95

Phe Ala Gly Ala Met Gly Thr Glu Asp Cys Asp His Glu Gly Arg Ser
100 105 110

Val Ala Ala Pro Val Glu Val Met Ala Leu Tyr Ala Thr Asp Gly Cys
115 120 125

Val Ile Thr Ser Ser Leu Ala Leu Leu Thr Asn Cys Leu Leu Gly Ala
130 135 140

Glu Pro Leu Tyr Ile Phe Ser Tyr Asp Ala Tyr Arg Pro Asp Ala Pro
145 150 155 160

Asn Gly Pro Thr Gly Ala Pro Thr Glu Gln Glu Arg Phe Glu Gly Ser
165 170 175

Arg Ala Leu Tyr Arg Asp Ala Gly Gln Gly Asp Ser Phe Arg Val Thr
180 185 190

Phe Cys Leu Leu Gly Thr Glu Val Gly Val Thr His His Pro Lys Gly
195 200 205

Arg Trp Met Phe Val Cys Arg Phe Glu Arg Ala Asp Asp Val Ala Val
210 215 220

Leu Gln Asp Ala Leu Gly Arg Gly Thr Pro Leu Leu Pro Ala His Ile
225 230 235 240

Thr Ala Thr Leu Asp Leu Glu Ala Thr Phe Ala Leu His Ala Asn Ile
245 250 255

Ile Met Ala Leu Thr Val Ala Ile Val His Asn Ala Pro Ala Arg Ile
260 265 270

Gly Ser Gly Ser Thr Ala Pro Leu Tyr Glu Pro Gly Glu Ser Met Arg
275 280 285

Ser Val Val Gly Arg Met Ser Leu Gly Gln Arg Gly Leu Thr Thr Leu
290 295 300

Phe Val His His Glu Ala Arg Val Leu Ala Ala Tyr Arg Arg Ala Tyr
305 310 315 320

Tyr Gly Ser Ala Gln Ser Pro Phe Trp Phe Leu Ser Lys Phe Gly Pro
325 330 335

Asp Glu Lys Ser Leu Val Leu Ala Ala Arg Tyr Tyr Val Leu Gln Ala
340 345 350

Pro Arg Leu Gly Gly Ala Gly Ala Thr Tyr Asp Leu Gln Ala Val Lys
355 360 365

Asp Ile Cys Ala Thr Tyr Ala Ile Pro His Asp Pro Arg Pro Asp Thr
370 375 380

Leu Ser Ala Ala Ser Leu Thr Ser Phe Ala Ala Ile Thr Arg Phe Cys
385 390 395 400

Cys Thr Ser Gln Tyr Ser Arg Gly Ala Ala Ala Ala Gly Phe Pro Leu
405 410 415

Tyr Val Glu Arg Arg Ile Ala Ala Asp Val Arg Glu Thr Gly Ala Leu
420 425 430

Glu Lys Phe Ile Ala His Asp Arg Ser Cys Leu Arg Val Ser Asp Arg
435 440 445

Glu Phe Ile Thr Tyr Ile Tyr Leu Ala His Phe Glu Cys Phe Ser Pro
450 455 460

Pro Arg Leu Ala Thr His Leu Arg Ala Val Thr Thr His Asp Pro Ser
465 470 475 480

Pro Ala Ala Ser Thr Thr Glu Gln Pro Ser Pro Leu Gly Arg Glu Ala
485 490 495

Val Glu Gln Phe Phe Arg His Arg Ala Gln Leu Asn Ile Arg Glu Tyr
500 505 510

Val Lys Gln Asn Val Thr Pro Arg Glu Thr Ala Gly Asp Ala Ala Ala
515 520 525

Ala Tyr Leu Arg Ala Arg Thr Tyr Ala Pro Ala Ala Leu Thr Pro Ala
530 535 540

Pro Ala Tyr Cys Gly Val Ala Asp Ser Ser Thr Lys Met Met Gly Arg
545 550 555 560

Leu Ala Glu Ala Glu Arg Leu Leu Val Pro His Gly Trp Pro Ala Phe
565 570 575

Ala Pro Thr Thr Pro Gly Asp Asp Ala Gly Gly Gly Thr Ala Ala Pro
580 585 590

Gln Thr Cys Gly Ile Val Lys Arg Leu Leu Lys Leu Ala Ala Thr Glu
595 600 605

Gln Gln Gly Thr Thr Pro Pro Ala Ile Ala Ala Leu Met Gln Asp Ala
610 615 620

Ser Val Gln Thr Pro Leu Pro Val Tyr Arg Ile Thr Met Ser Pro Thr
625 630 635 640

Gly Gln Ala Phe Ala Ala Ala Ala Arg Asp Asp Trp Ala Arg Val Thr
645 650 655

Arg Asp Ala Arg Pro Pro Glu Ala Thr Val Val Ala Asp Ala Ala Ala
660 665 670

Ala Pro Glu Pro Gly Ala Leu Gly Arg Arg Leu Thr Arg Arg Ile Cys
675 680 685

Arg Pro Ala Pro Pro Pro Gly Arg Pro Gly Arg Arg Gly Pro Asp Val
690 695 700

Arg Glu Pro Gln Arg Asp Leu Gln Arg Arg Ala Gly Arg Tyr Glu His
705 710 715 720

His Pro Gly Ser Gly His Arg Pro Glu Gly Ala Arg Pro Leu Ser Pro
725 730 735

Ala Pro Arg Gly Pro Gly Ser Leu
740

(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 762 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284:

Met Val Glu Pro Ser Ser Pro Gly Trp Trp Arg Ala Ser Leu Ser Arg
1 5 10 15

Leu Thr Met Gln Ala Trp Tyr Val Arg Ala Arg Ala Arg Ala Phe Thr
20 25 30

Arg Arg Arg Val Ser Ser Ser Asp Ser Arg Ala Ser Ser Ser Val Met
35 40 45

Gly Ala Gly Lys Ser Ala Leu Thr Thr Ala Arg Ala Ser Cys Ser Arg
50 55 60

Gly Ser Xaa Ser Glu Gly Gly Ala Ala Ala Arg Ile Ile Ser Tyr Cys
65 70 75 80

Cys Ser Ser Gly Arg Val Pro Gln Pro His Ser Thr Pro Ser Arg Asp
85 90 95

Ala Ile Pro Glu His Arg Ser Ala Pro Ala Phe Pro His Pro Thr Pro
100 105 110

Ser Gly Phe Ala Gly Ala Met Gly Thr Glu Asp Cys Asp His Glu Gly
115 120 125

Arg Ser Val Ala Ala Pro Val Glu Val Met Ala Leu Tyr Ala Thr Asp
130 135 140

Gly Cys Val Ile Thr Ser Ser Leu Ala Leu Leu Thr Asn Cys Leu Leu
145 150 155 160

Gly Ala Glu Pro Leu Tyr Ile Phe Ser Tyr Asp Ala Tyr Arg Pro Asp
165 170 175

Ala Pro Asn Gly Pro Thr Gly Ala Pro Thr Glu Gln Glu Arg Phe Glu
180 185 190

Gly Ser Arg Ala Leu Tyr Arg Asp Ala Gly Gln Gly Asp Ser Phe Arg
195 200 205

Val Thr Phe Cys Leu Leu Gly Thr Glu Val Gly Val Thr His His Pro
210 215 220

Lys Gly Arg Trp Met Phe Val Cys Arg Phe Glu Arg Ala Asp Asp Val
225 230 235 240

Ala Val Leu Gln Asp Ala Leu Gly Arg Gly Thr Pro Leu Leu Pro Ala
245 250 255

His Ile Thr Ala Thr Leu Asp Leu Glu Ala Thr Phe Ala Leu His Ala
260 265 270

Asn Ile Ile Met Ala Leu Thr Val Ala Ile Val His Asn Ala Pro Ala
275 280 285

Arg Ile Gly Ser Gly Ser Thr Ala Pro Leu Tyr Glu Pro Gly Glu Ser
290 295 300

Met Arg Ser Val Val Gly Arg Met Ser Leu Gly Gln Arg Gly Leu Thr
305 310 315 320

Thr Leu Phe Val His His Glu Ala Arg Val Leu Ala Ala Tyr Arg Arg
325 330 335

Ala Tyr Tyr Gly Ser Ala Gln Ser Pro Phe Trp Phe Leu Ser Lys Phe
340 345 350

Gly Pro Asp Glu Lys Ser Leu Val Leu Ala Ala Arg Tyr Tyr Val Leu
355 360 365

Gln Ala Pro Arg Leu Gly Gly Ala Gly Ala Thr Tyr Asp Leu Gln Ala
370 375 380

Val Lys Asp Ile Cys Ala Thr Tyr Ala Ile Pro His Asp Pro Arg Pro
385 390 395 400

Asp Thr Leu Ser Ala Ala Ser Leu Thr Ser Phe Ala Ala Ile Thr Arg
405 410 415

Phe Cys Cys Thr Ser Gln Tyr Ser Arg Gly Ala Ala Ala Ala Gly Phe
420 425 430

Pro Leu Tyr Val Glu Arg Arg Ile Ala Ala Asp Val Arg Glu Thr Gly
435 440 445

Ala Leu Glu Lys Phe Ile Ala His Asp Arg Ser Cys Leu Arg Val Ser
450 455 460

Asp Arg Glu Phe Ile Thr Tyr Ile Tyr Leu Ala His Phe Glu Cys Phe
465 470 475 480

Ser Pro Pro Arg Leu Ala Thr His Leu Arg Ala Val Thr Thr His Asp
485 490 495

Pro Ser Pro Ala Ala Ser Thr Glu Gln Pro Ser Pro Leu Gly Arg Glu
500 505 510

Ala Val Glu Gln Phe Phe Arg His Val Arg Ala Gln Leu Asn Ile Arg
515 520 525

Glu Tyr Val Lys Gln Asn Val Thr Pro Arg Glu Thr Ala Gly Asp Ala
530 535 540

Ala Ala Ala Tyr Leu Arg Ala Arg Thr Tyr Ala Pro Ala Ala Leu Thr
545 550 555 560

Pro Ala Pro Ala Tyr Cys Gly Val Ala Asp Ser Ser Thr Lys Met Met
565 570 575

Gly Arg Leu Ala Glu Ala Glu Arg Leu Leu Val Pro His Gly Trp Pro
580 585 590

Ala Phe Ala Pro Thr Thr Pro Gly Asp Asp Ala Gly Gly Gly Thr Ala
595 600 605

Ala Pro Gln Thr Cys Gly Ile Val Lys Arg Leu Leu Lys Leu Ala Ala
610 615 620

Thr Glu Gln Gln Gly Thr Thr Pro Pro Ala Ile Ala Ala Leu Met Gln
625 630 635 640

Asp Ala Ser Val Gln Thr Pro Leu Pro Val Tyr Arg Ile Thr Met Ser
645 650 655

Pro Thr Gly Gln Ala Phe Ala Ala Ala Ala Arg Asp Asp Trp Ala Arg
660 665 670

Val Thr Arg Asp Ala Arg Pro Pro Glu Ala Thr Val Val Ala Asp Ala
675 680 685

Ala Ala Ala Pro Glu Pro Gly Ala Leu Gly Arg Arg Leu Thr Arg Arg
690 695 700

Ile Cys Arg Pro Ala Pro Pro Pro Gly Arg Pro Gly Arg Arg Gly Pro
705 710 715 720

Asp Val Arg Glu Pro Gln Arg Asp Leu Gln Arg Arg Ala Gly Arg Tyr
725 730 735

Glu His His Pro Gly Ser Gly His Arg Pro Glu Gly Ala Arg Pro Leu
740 745 750

Ser Pro Ala Pro Arg Gly Pro Gly Ser Leu
755 760

(2) INFORMATION FOR SEQ ID NO: 285: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 781 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285: 

Met His Val Ser Ala Arg Arg Arg Ile Leu Ser Arg Cys Ala Ala Thr
1 5 10 15

Ala Pro Ser Met Val Glu Pro Ser Ser Pro Gly Trp Trp Arg Ala Ser
20 25 30

Leu Ser Arg Leu Thr Met Gln Ala Trp Tyr Val Arg Ala Arg Ala Arg
35 40 45

Ala Phe Thr Arg Arg Arg Val Ser Ser Ser Asp Ser Arg Ala Ser Ser
50 55 60

Ser Val Met Gly Ala Gly Lys Ser Ala Leu Thr Thr Ala Arg Ala Ser
65 70 75 80

Cys Ser Arg Gly Ser Xaa Ser Glu Gly Gly Ala Ala Ala Arg Ile Ile
85 90 95

Ser Tyr Cys Cys Ser Ser Gly Arg Val Pro Gln Pro His Ser Thr Pro
100 105 110

Ser Arg Asp Ala Ile Pro Glu His Arg Ser Ala Pro Ala Phe Pro His
115 120 125

Pro Thr Pro Ser Gly Phe Ala Gly Ala Met Gly Thr Glu Asp Cys Asp
130 135 140

His Glu Gly Arg Ser Val Ala Ala Pro Val Glu Val Met Ala Leu Tyr
145 150 155 160

Ala Thr Asp Gly Cys Val Ile Thr Ser Ser Leu Ala Leu Leu Thr Asn
165 170 175

Cys Leu Leu Gly Ala Glu Pro Leu Tyr Ile Phe Ser Tyr Asp Ala Tyr
180 185 190

Arg Pro Asp Ala Pro Asn Gly Pro Thr Gly Ala Pro Thr Glu Gln Glu
195 200 205

Arg Phe Glu Gly Ser Arg Ala Leu Tyr Arg Asp Ala Gly Gln Gly Asp
210 215 220

Ser Phe Arg Val Thr Phe Cys Leu Leu Gly Thr Glu Val Gly Val Thr
225 230 235 240

His His Pro Lys Gly Arg Trp Met Phe Val Cys Arg Phe Glu Arg Ala
245 250 255

Asp Asp Val Ala Val Leu Gln Asp Ala Leu Gly Arg Gly Thr Pro Leu
260 265 270

Leu Pro Ala His Ile Thr Ala Thr Leu Asp Leu Glu Ala Thr Phe Ala
275 280 285

Leu His Ala Asn Ile Ile Met Ala Leu Thr Val Ala Ile Val His Asn
290 295 300

Ala Pro Ala Arg Ile Gly Ser Gly Ser Thr Ala Pro Leu Tyr Glu Pro
305 310 315 320

Gly Glu Ser Met Arg Ser Val Val Gly Arg Met Ser Leu Gly Gln Arg
325 330 335

Gly Leu Thr Thr Leu Phe Val His His Glu Ala Arg Val Leu Ala Ala
340 345 350

Tyr Arg Arg Ala Tyr Tyr Gly Ser Ala Gln Ser Pro Phe Trp Phe Leu
355 360 365

Ser Lys Phe Gly Pro Asp Glu Lys Ser Leu Val Leu Ala Ala Arg Tyr
370 375 380

Tyr Val Leu Gln Ala Pro Arg Leu Gly Gly Ala Gly Ala Thr Tyr Asp
385 390 395 400

Leu Gln Ala Val Lys Asp Ile Cys Ala Thr Tyr Ala Ile Pro His Asp
405 410 415

Pro Arg Pro Asp Thr Leu Ser Ala Ala Ser Leu Thr Ser Phe Ala Ala
420 425 430

Ile Thr Arg Phe Cys Cys Thr Ser Gln Tyr Ser Arg Gly Ala Ala Ala
435 440 445

Ala Gly Phe Pro Leu Tyr Val Glu Arg Arg Ile Ala Ala Asp Val Arg
450 455 460

Glu Thr Gly Ala Leu Glu Lys Phe Ile Ala His Asp Arg Ser Cys Leu
465 470 475 480

Arg Val Ser Asp Arg Glu Phe Ile Thr Tyr Ile Tyr Leu Ala His Phe
485 490 495

Glu Cys Phe Ser Pro Pro Arg Leu Ala Thr His Leu Arg Ala Val Thr
500 505 510

Thr His Asp Pro Ser Pro Ala Ala Ser Thr Glu Gln Pro Ser Pro Leu
515 520 525

Gly Arg Glu Ala Val Glu Gln Phe Phe Arg His Val Arg Ala Gln Leu
530 535 540

Asn Ile Arg Glu Tyr Val Lys Gln Asn Val Thr Pro Arg Glu Thr Ala
545 550 555 560

Gly Asp Ala Ala Ala Ala Tyr Leu Arg Ala Arg Thr Tyr Ala Pro Ala
565 570 575

Ala Leu Thr Pro Ala Pro Ala Tyr Cys Gly Val Ala Asp Ser Ser Thr
580 585 590

Lys Met Met Gly Arg Leu Ala Glu Ala Glu Arg Leu Leu Val Pro His
595 600 605

Gly Trp Pro Ala Phe Ala Pro Thr Thr Pro Gly Asp Asp Ala Gly Gly
610 615 620

Gly Thr Ala Ala Pro Gln Thr Cys Gly Ile Val Lys Arg Leu Leu Lys
625 630 635 640

Leu Ala Ala Thr Glu Gln Gln Gly Thr Thr Pro Pro Ala Ile Ala Ala
645 650 655

Leu Met Gln Asp Ala Ser Val Gln Thr Pro Leu Pro Val Tyr Arg Ile
660 665 670

Thr Met Ser Pro Thr Gly Gln Ala Phe Ala Ala Ala Ala Arg Asp Asp
675 680 685

Trp Ala Arg Val Thr Arg Asp Ala Arg Pro Pro Glu Ala Thr Val Val
690 695 700

Ala Asp Ala Ala Ala Ala Pro Glu Pro Gly Ala Leu Gly Arg Arg Leu
705 710 715 720

Thr Arg Arg Ile Cys Arg Pro Ala Pro Pro Pro Gly Arg Pro Gly Arg
725 730 735

Arg Gly Pro Asp Val Arg Glu Pro Gln Arg Asp Leu Gln Arg Arg Ala
740 745 750

Gly Arg Tyr Glu His His Pro Gly Ser Gly His Arg Pro Glu Gly Ala
755 760 765

Arg Pro Leu Ser Pro Ala Pro Arg Gly Pro Gly Ser Leu
770 775 780

(2) INFORMATION FOR SEQ ID NO: 286:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 784 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286:

Met Val Ala Met His Val Ser Ala Arg Arg Arg Ile Leu Ser Arg Cys
1 5 10 15

Ala Ala Thr Ala Pro Ser Met Val Glu Pro Ser Ser Pro Gly Trp Trp
20 25 30

Arg Ala Ser Leu Ser Arg Leu Thr Met Gln Ala Trp Tyr Val Arg Ala
35 40 45

Arg Ala Arg Ala Phe Thr Arg Arg Arg Val Ser Ser Ser Asp Ser Arg
50 55 60

Ala Ser Ser Ser Val Met Gly Ala Gly Lys Ser Ala Leu Thr Thr Ala
65 70 75 80

Arg Ala Ser Cys Ser Arg Gly Ser Xaa Ser Glu Gly Gly Ala Ala Ala
85 90 95

Arg Ile Ile Ser Tyr Cys Cys Ser Ser Gly Arg Val Pro Gln Pro His
100 105 110

Ser Thr Pro Ser Arg Asp Ala Ile Pro Glu His Arg Ser Ala Pro Ala
115 120 125

Phe Pro His Pro Thr Pro Ser Gly Phe Ala Gly Ala Met Gly Thr Glu
130 135 140

Asp Cys Asp His Glu Gly Arg Ser Val Ala Ala Pro Val Glu Val Met
145 150 155 160

Ala Leu Tyr Ala Thr Asp Gly Cys Val Ile Thr Ser Ser Leu Ala Leu
165 170 175

Leu Thr Asn Cys Leu Leu Gly Ala Glu Pro Leu Tyr Ile Phe Ser Tyr
180 185 190

Asp Ala Tyr Arg Pro Asp Ala Pro Asn Gly Pro Thr Gly Ala Pro Thr
195 200 205

Glu Gln Glu Arg Phe Glu Gly Ser Arg Ala Leu Tyr Arg Asp Ala Gly
210 215 220

Gln Gly Asp Ser Phe Arg Val Thr Phe Cys Leu Leu Gly Thr Glu Val
225 230 235 240

Gly Val Thr His His Pro Lys Gly Arg Trp Met Phe Val Cys Arg Phe
245 250 255

Glu Arg Ala Asp Asp Val Ala Val Leu Gln Asp Ala Leu Gly Arg Gly
260 265 270

Thr Pro Leu Leu Pro Ala His Ile Thr Ala Thr Leu Asp Leu Glu Ala
275 280 285

Thr Phe Ala Leu His Ala Asn He He Met Ala Leu Thr Val Ala He 
290 295 300 
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Val His Asn Ala Pro Ala Arg lie Gly Ser Gly Ser Thr Ala Pro Leu 
305 310 315 320 

Tyr Glu Pro Gly Glu Ser Met JVrg Ser Val Val Gly Arg Met Ser Leu 
325 330 335 

5 Gly Gin Arg Gly Leu Thr Thr Leu Phe Val His His Glu Ala Arg Val 
340 345 350 

Leu Ala Ala Tyr Arg Arg Ala Tyr Tyr Gly Ser Ala Gin Ser Pro Phe 

355 360 365 

Trp Phe Leu Ser Lys Phe Gly Pro Asp Glu Lys Ser Leu Val Leu Ala 
10 370 375 380 

Ala Arg Tyr Tyr Val Leu Gin Ala Pro Arg Leu Gly Gly Ala Gly Ala 
385 390 395 400 

Thr Tyr Asp Leu Gin Ala Val Lys Asp lie Cys Ala Thr Tyr Ala lie 
405 410 415 

15 Pro His Asp Pro Arg Pro Asp Thr Leu Ser Ala Ala Ser Leu Thr Ser 
420 425 430 

Phe Ala Ala lie Thr Arg Phe Cys Cys Thr Ser Gin Tyr Ser Arg Gly 

435 440 445 

Ala Ala Ala Ala Gly Phe Pro Leu Tyr Val Glu Arg Arg lie Ala Ala 
20 450 455 460 

Asp Val Arg Glu Thr Gly Ala Leu Glu Lys Phe lie Ala His Asp Arg 
465 470 475 480 

Ser Cys Leu Arg Val Ser Asp Arg Glu Phe lie Thr Tyr lie Tyr Leu 
485 490 495 

25 Ala His Phe Glu Cys Phe Ser Pro Pro Arg Leu Ala Thr His Leu Arg 
500 505 510 

Ala Val Thr Thr His Asp Pro Ser Pro Ala Ala Ser Thr Glu Gin Pro 

515 520 525

Ser Pro Leu Gly Arg Glu Ala Val Glu Gin Phe Phe Arg His Val Arg 
30 530 535 540 

Ala Gin Leu Asn lie Arg Glu Tyr Val Lys Gin Asn Val Thr Pro Arg 
545 550 555 560 

Glu Thr Ala Gly Asp Ala Ala Ala Ala Tyr Leu Arg Ala Arg Thr Tyr 
565 570 575 

35 Ala Pro Ala Ala Leu Thr Pro Ala Pro Ala Tyr Cys Gly Val Ala Asp 
580 585 590 

Ser Ser Thr Lys Met Met Gly Arg Leu Ala Glu Ala Glu Arg Leu Leu 

595 600 605 

Val Pro His Gly Trp Pro Ala Phe Ala Pro Thr Thr Pro Gly Asp Asp 
40 610 615 620 

Ala Gly Gly Gly Thr Ala Ala Pro Gin Thr Cys Gly lie Val Lys Arg 
625 630 635 640 

Leu Leu Lys Leu Ala Ala Thr Glu Gln Gln Gly Thr Thr Pro Pro Ala
645 650 655

He Ala Ala Leu Met Gin Asp Ala Ser Val Gin Thr Pro Leu Pro Val 

660 665 670 

Tyr Arg He Thr Met Ser Pro Thr Gly Gin Ala Phe Ala Ala Ala Ala 
5 675 680 685 

Arg Asp Asp Trp Ala Arg Val Thr Arg Asp Ala Arg Pro Pro Glu Ala 

690 695 700 

Thr Val Val Ala Asp Ala Ala Ala Ala Pro Glu Pro Gly Ala Leu Gly 
705 710 715 720 

10 Arg Arg Leu Thr Arg Arg He Cys Arg Pro Ala Pro Pro Pro Gly Arg 

725 730 735 

Pro Gly Arg Arg Gly Pro Asp Val Arg Glu Pro Gin Arg Asp Leu Gin 

740 745 750 

Arg Arg Ala Gly Arg Tyr Glu His His Pro Gly Ser Gly His Arg Pro 
15 755 760 765 

Glu Gly Ala Arg Pro Leu Ser Pro Ala Pro Arg Gly Pro Gly Ser Leu 
770 775 780 



(2) INFORMATION FOR SEQ ID NO: 287:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 789 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287:

Met Tyr Ile Cys Arg Met Val Ala Met His Val Ser Ala Arg Arg Arg
1 5 10 15

He Leu Ser Arg Cys Ala Ala Thr Ala Pro Ser Met Val Glu Pro Ser 
20 25 30 

35 Ser Pro Gly Trp Trp Arg Ala Ser Leu Ser Arg Leu Thr Met Gin Ala 
35 40 45 

Trp Tyr Val Arg Ala Arg Ala Arg Ala Phe Thr Arg Arg Arg Val Ser 

50 55 60 

Ser Ser Asp Ser Arg Ala Ser Ser Ser Val Met Gly Ala Gly Lys Ser 
40 65 70 75 80 

Ala Leu Thr Thr Ala Arg Ala Ser Cys Ser Arg Gly Ser Xaa Ser Glu 

85 90 95 

Gly Gly Ala Ala Ala Arg Ile Ile Ser Tyr Cys Cys Ser Ser Gly Arg
100 105 110

Val Pro Gln Pro His Ser Thr Pro Ser Arg Asp Ala Ile Pro Glu His

115 120 125

Arg Ser Ala Pro Ala Phe Pro His Pro Thr Pro Ser Gly Phe Ala Gly 
5 130 135 140 

Ala Met Gly Thr Glu Asp Cys Asp His Glu Gly Arg Ser Val Ala Ala 
145 150 155 160 

Pro Val Glu Val Met Ala Leu Tyr Ala Thr Asp Gly Cys Val lie Thr 
165 170 175 

10 Ser Ser Leu Ala Leu Leu Thr Asn Cys Leu Leu Gly Ala Glu Pro Leu 
180 185 190 

Tyr lie Phe Ser Tyr Asp Ala Tyr Arg Pro Asp Ala Pro Asn Gly Pro 

195 200 205 

Thr Gly Ala Pro Thr Glu Gin Glu Arg Phe Glu Gly Ser Arg Ala Leu 
15 210 215 220 

Tyr Arg Asp Ala Gly Gin Gly Asp Ser Phe Arg Val Thr Phe Cys Leu 
225 230 235 240 

Leu Gly Thr Glu Val Gly Val Thr His His Pro Lys Gly Arg Trp Met 
245 250 255 

20 Phe Val Cys Arg Phe Glu Arg Ala Asp Asp Val Ala Val Leu Gin Asp 
260 265 270 

Ala Leu Gly Arg Gly Thr Pro Leu Leu Pro Ala His He Thr Ala Thr 

275 280 285 

Leu Asp Leu Glu Ala Thr Phe Ala Leu His Ala Asn He He Met Ala 
25 290 295 300 

Leu Thr Val Ala He Val His Asn Ala Pro Ala Arg lie Gly Ser Gly 
305 310 315 320 

Ser Thr Ala Pro Leu Tyr Glu Pro Gly Glu Ser Met Arg Ser Val Val
325 330 335 

30 Gly Arg Met Ser Leu Gly Gin Arg Gly Leu Thr Thr Leu Phe Val His 
340 345 350 

His Glu Ala Arg Val Leu Ala Ala Tyr Arg Arg Ala Tyr Tyr Gly Ser 

355 360 365 

Ala Gin Ser Pro Phe Trp Phe Leu Ser Lys Phe Gly Pro Asp Glu Lys 
35 370 375 380 

Ser Leu Val Leu Ala Ala Arg Tyr Tyr Val Leu Gin Ala Pro Arg Leu 
385 390 395 400 

Gly Gly Ala Gly Ala Thr Tyr Asp Leu Gin Ala Val Lys Asp He Cys 
405 410 415 

40 Ala Thr Tyr Ala He Pro His Asp Pro Arg Pro Asp Thr Leu Ser Ala 
420 425 430 

Ala Ser Leu Thr Ser Phe Ala Ala He Thr Arg Phe Cys Cys Thr Ser 
435 440 445

Gln Tyr Ser Arg Gly Ala Ala Ala Ala Gly Phe Pro Leu Tyr Val Glu

450 455 460 

Arg Arg Ile Ala Ala Asp Val Arg Glu Thr Gly Ala Leu Glu Lys Phe
465 470 475 480 

5 lie Ala His Asp Arg Ser Cys Leu Arg Val Ser Asp Arg Glu Phe lie 

485 490 495 

Thr Tyr lie Tyr Leu Ala His Phe Glu Cys Phe Ser Pro Pro Arg Leu 

500 505 510 

Ala Thr His Leu Arg Ala Val Thr Thr His Asp Pro Ser Pro Ala Ala 
10 515 520 525 

Ser Thr Glu Gin Pro Ser Pro Leu Gly Arg Glu Ala Val Glu Gin Phe 

530 535 540 

Phe Arg His Val Arg Ala Gin Leu Asn lie Arg Glu Tyr Val Lys Gin 
545 550 555 560 

15 Asn Val Thr Pro Arg Glu Thr Ala Gly Asp Ala Ala Ala Ala Tyr Leu 

565 570 575 

Arg Ala Arg Thr Tyr Ala Pro Ala Ala Leu Thr Pro Ala Pro Ala Tyr 

580 585 590 

Cys Gly Val Ala Asp Ser Ser Thr Lys Met Met Gly Arg Leu Ala Glu 
20 595 600 605 

Ala Glu Arg Leu Leu Val Pro His Gly Trp Pro Ala Phe Ala Pro Thr 

610 615 620 

Thr Pro Gly Asp Asp Ala Gly Gly Gly Thr Ala Ala Pro Gin Thr Cys 
625 630 635 640 

25 Gly lie Val Lys Arg Leu Leu Lys Leu Ala Ala Thr Glu Gin Gin Gly 

645 650 655 

Thr Thr Pro Pro Ala lie Ala Ala Leu Met Gin Asp Ala Ser Val Gin 

660 665 670

Thr Pro Leu Pro Val Tyr Arg lie Thr Met Ser Pro Thr Gly Gin Ala 
30 675 680 685 

Phe Ala Ala Ala Ala Arg Asp Asp Trp Ala Arg Val Thr Arg Asp Ala 

690 695 700 

Arg Pro Pro Glu Ala Thr Val Val Ala Asp Ala Ala Ala Ala Pro Glu 
705 710 715 720 

35 Pro Gly Ala Leu Gly Arg Arg Leu Thr Arg Arg lie Cys Arg Pro Ala 

725 730 735 

Pro Pro Pro Gly Arg Pro Gly Arg Arg Gly Pro Asp Val Arg Glu Pro 

740 745 750 

Gin Arg Asp Leu Gin Arg Arg Ala Gly Arg Tyr Glu His His Pro Gly 
40 755 760 765 

Ser Gly His Arg Pro Glu Gly Ala Arg Pro Leu Ser Pro Ala Pro Arg 

770 775 780 

Gly Pro Gly Ser Leu 

785 



(2) INFORMATION FOR SEQ ID NO: 288:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 809 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 288:

Met Leu Arg Met Ala Trp Glu Thr Ser Thr Ser Ala Asp Leu Ser Ala
1 5 10 15

Ala Pro Thr Asp Met Tyr He Cys Arg Met Val Ala Met His Val Ser 

20 25 30 

Ala Arg Arg Arg He Leu Ser Arg Cys Ala Ala Thr Ala Pro Ser Met 
20 35 40 45 

Val Glu Pro Ser Ser Pro Gly Trp Trp Arg Ala Ser Leu Ser Arg Leu 

50 55 60

Thr Met Gin Ala Trp Tyr Val Arg Ala Arg Ala Arg Ala Phe Thr Arg 
65 70 75 80 

25 Arg Arg Val Ser Ser Ser Asp Ser Arg Ala Ser Ser Ser Val Met Gly 
85 90 95

Ala Gly Lys Ser Ala Leu Thr Thr Ala Arg Ala Ser Cys Ser Arg Gly 

100 105 110 

Ser Xaa Ser Glu Gly Gly Ala Ala Ala Arg He He Ser Tyr Cys Cys 
30 115 120 125 

Ser Ser Gly Arg Val Pro Gin Pro His Ser Thr Pro Ser Arg Asp Ala 

130 135 140 

He Pro Glu His Arg Ser Ala Pro Ala Phe Pro His Pro Thr Pro Ser 
145 150 155 160 

35 Gly Phe Ala Gly Ala Met Gly Thr Glu Asp Cys Asp His Glu Gly Arg 

165 170 175 

Ser Val Ala Ala Pro Val Glu Val Met Ala Leu Tyr Ala Thr Asp Gly 

180 185 190 

Cys Val He Thr Ser Ser Leu Ala Leu Leu Thr Asn Cys Leu Leu Gly 
40 195 200 205 

Ala Glu Pro Leu Tyr lie Phe Ser Tyr Asp Ala Tyr Arg Pro Asp Ala 

210 215 220 

Pro Asn Gly Pro Thr Gly Ala Pro Thr Glu Gln Glu Arg Phe Glu Gly
225 230 235 240

Ser Arg Ala Leu Tyr Arg Asp Ala Gly Gin Gly Asp Ser Phe Arg Val 

245 250 255

Thr Phe Cys Leu Leu Gly Thr Glu Val Gly Val Thr His His Pro Lys 
5 260 265 270 

Gly Arg Trp Met Phe Val Cys Arg Phe Glu Arg Ala Asp Asp Val Ala 

275 280 285 

Val Leu Gin Asp Ala Leu Gly Arg Gly Thr Pro Leu Leu Pro Ala His 
290 295 300 

10 He Thr Ala Thr Leu Asp Leu Glu Ala Thr Phe Ala Leu His Ala Asn 
305 310 315 320 

lie lie Met Ala Leu Thr Val Ala He Val His Asn Ala Pro Ala Arg 

325 330 335 

He Gly Ser Gly Ser Thr Ala Pro Leu Tyr Glu Pro Gly Glu Ser Met 
15 340 345 350 

Arg Ser Val Val Gly Arg Met Ser Leu Gly Gin Arg Gly Leu Thr Thr 

355 360 365 

Leu Phe Val His His Glu Ala Arg Val Leu Ala Ala Tyr Arg Arg Ala 
370 375 380 

20 Tyr Tyr Gly Ser Ala Gin Ser Pro Phe Trp Phe Leu Ser Lys Phe Gly 
385 390 395 400 

Pro Asp Glu Lys Ser Leu Val Leu Ala Ala Arg Tyr Tyr Val Leu Gin 

405 410 415 

Ala Pro Arg Leu Gly Gly Ala Gly Ala Thr Tyr Asp Leu Gin Ala Val 
25 420 425 430 

Lys Asp He Cys Ala Thr Tyr Ala He Pro His Asp Pro Arg Pro Asp 

435 440 445 

Thr Leu Ser Ala Ala Ser Leu Thr Ser Phe Ala Ala He Thr Arg Phe 
450 455 460 

30 Cys Cys Thr Ser Gin Tyr Ser Arg Gly Ala Ala Ala Ala Gly Phe Pro 
465 470 475 480 

Leu Tyr Val Glu Arg Arg He Ala Ala Asp Val Arg Glu Thr Gly Ala 

485 490 495 

Leu Glu Lys Phe He Ala His Asp Arg Ser Cys Leu Arg Val Ser Asp 
35 500 505 510 

Arg Glu Phe He Thr Tyr He Tyr Leu Ala His Phe Glu Cys Phe Ser 

515 520 525 

Pro Pro Arg Leu Ala Thr His Leu Arg Ala Val Thr Thr His Asp Pro 
530 535 540 

Ser Pro Ala Ala Ser Thr Glu Gln Pro Ser Pro Leu Gly Arg Glu Ala
545 550 555 560 

Val Glu Gln Phe Phe Arg His Val Arg Ala Gln Leu Asn Ile Arg Glu
565 570 575

Tyr Val Lys Gln Asn Val Thr Pro Arg Glu Thr Ala Gly Asp Ala Ala

580 585 590 

Ala Ala Tyr Leu Arg Ala Arg Thr Tyr Ala Pro Ala Ala Leu Thr Pro
595 600 605 

5 Ala Pro Ala Tyr Cys Gly Val Ala Asp Ser Ser Thr Lys Met Met Gly 
610 615 620 

Arg Leu Ala Glu Ala Glu Arg Leu Leu Val Pro His Gly Trp Pro Ala 
625 630 635 640 

Phe Ala Pro Thr Thr Pro Gly Asp Asp Ala Gly Gly Gly Thr Ala Ala 
10 645 650 655 

Pro Gin Thr Cys Gly lie Val Lys Arg Leu Leu Lys Leu Ala Ala Thr 

660 665 670 

Glu Gin Gin Gly Thr Thr Pro Pro Ala lie Ala Ala Leu Met Gin Asp 
675 680 685 

15 Ala Ser Val Gin Thr Pro Leu Pro Val Tyr Arg lie Thr Met Ser Pro 
690 695 700 

Thr Gly Gin Ala Phe Ala Ala Ala Ala Arg Asp Asp Trp Ala Arg Val 
705 710 715 720 

Thr Arg Asp Ala Arg Pro Pro Glu Ala Thr Val Val Ala Asp Ala Ala 
20 725 730 735 

Ala Ala Pro Glu Pro Gly Ala Leu Gly Arg Arg Leu Thr Arg Arg lie 

740 745 750 

Cys Arg Pro Ala Pro Pro Pro Gly Arg Pro Gly Arg Arg Gly Pro Asp 
755 760 765 

25 Val Arg Glu Pro Gin Arg Asp Leu Gin Arg Arg Ala Gly Arg Tyr Glu 
770 775 780 

His His Pro Gly Ser Gly His Arg Pro Glu Gly Ala Arg Pro Leu Ser 
785 790 795 800 

Pro Ala Pro Arg Gly Pro Gly Ser Leu 
805

(2) INFORMATION FOR SEQ ID NO: 289:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 816 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:

Met Thr Thr Ser Leu Ser Ala Met Leu Arg Met Ala Trp Glu Thr Ser
1 5 10 15

Thr Ser Ala Asp Leu Ser Ala Ala Pro Thr Asp Met Tyr Ile Cys Arg
20 25 30 

5 Met Val Ala Met His Val Ser Ala Arg Arg Arg He Leu Ser Arg Cys 
35 40 45 

Ala Ala Thr Ala Pro Ser Met Val Glu Pro Ser Ser Pro Gly Trp Trp 

50 55 60 

Arg Ala Ser Leu Ser Arg Leu Thr Met Gin Ala Trp Tyr Val Arg Ala 
10 65 70 75 80 

Arg Ala Arg Ala Phe Thr Arg Arg Arg Val Ser Ser Ser Asp Ser Arg 

85 90 95 

Ala Ser Ser Ser Val Met Gly Ala Gly Lys Ser Ala Leu Thr Thr Ala 
100 105 110 

15 Arg Ala Ser Cys Ser Arg Gly Ser Xaa Ser Glu Gly Gly Ala Ala Ala 
115 120 125 

Arg He He Ser Tyr Cys Cys Ser Ser Gly Arg Val Pro Gin Pro His 

130 135 140 

Ser Thr Pro Ser Arg Asp Ala He Pro Glu His Arg Ser Ala Pro Ala 
20 145 150 155 160 

Phe Pro His Pro Thr Pro Ser Gly Phe Ala Gly Ala Met Gly Thr Glu 

165 170 175 

Asp Cys Asp His Glu Gly Arg Ser Val Ala Ala Pro Val Glu Val Met 
180 185 190 

25 Ala Leu Tyr Ala Thr Asp Gly Cys Val lie Thr Ser Ser Leu Ala Leu 
195 200 205 

Leu Thr Asn Cys Leu Leu Gly Ala Glu Pro Leu Tyr He Phe Ser Tyr 

210 215 220 

Asp Ala Tyr Arg Pro Asp Ala Pro Asn Gly Pro Thr Gly Ala Pro Thr 
30 225 230 235 240 

Glu Gin Glu Arg Phe Glu Gly Ser Arg Ala Leu Tyr Arg Asp Ala Gly 

245 250 255 

Gin Gly Asp Ser Phe Arg Val Thr Phe Cys Leu Leu Gly Thr Glu Val 
260 265 270 

35 Gly Val Thr His His Pro Lys Gly Arg Trp Met Phe Val Cys Arg Phe 
275 280 285 

Glu Arg Ala Asp Asp Val Ala Val Leu Gin Asp Ala Leu Gly Arg Gly 

290 295 300 

Thr Pro Leu Leu Pro Ala His He Thr Ala Thr Leu Asp Leu Glu Ala 
40 305 310 315 320 

Thr Phe Ala Leu His Ala Asn He He Met Ala Leu Thr Val Ala He 

325 330 335 

Val His Asn Ala Pro Ala Arg Ile Gly Ser Gly Ser Thr Ala Pro Leu
340 345 350

Tyr Glu Pro Gly Glu Ser Met Arg Ser Val Val Gly Arg Met Ser Leu 

355 360 365

Gly Gin Arg Gly Leu Thr Thr Leu Phe Val His His Glu Ala Arg Val 
5 370 375 380 

Leu Ala Ala Tyr Arg Arg Ala Tyr Tyr Gly Ser Ala Gin Ser Pro Phe 
385 390 395 400 

Trp Phe Leu Ser Lys Phe Gly Pro Asp Glu Lys Ser Leu Val Leu Ala 
405 410 415 

10 Ala Arg Tyr Tyr Val Leu Gin Ala Pro Arg Leu Gly Gly Ala Gly Ala 
420 425 430 

Thr Tyr Asp Leu Gin Ala Val Lys Asp lie Cys Ala Thr Tyr Ala lie 

435 440 445 

Pro His Asp Pro Arg Pro Asp Thr Leu Ser Ala Ala Ser Leu Thr Ser 
15 450 455 460 

Phe Ala Ala lie Thr Arg Phe Cys Cys Thr Ser Gin Tyr Ser Arg Gly 
465 470 475 480 

Ala Ala Ala Ala Gly Phe Pro Leu Tyr Val Glu Arg Arg He Ala Ala 
485 490 495 

20 Asp Val Arg Glu Thr Gly Ala Leu Glu Lys Phe He Ala His Asp Arg 
500 505 510 

Ser Cys Leu Arg Val Ser Asp Arg Glu Phe He Thr Tyr He Tyr Leu 

515 520 525 

Ala His Phe Glu Cys Phe Ser Pro Pro Arg Leu Ala Thr His Leu Arg 
25 530 535 540 

Ala Val Thr Thr His Asp Pro Ser Pro Ala Ala Ser Thr Glu Gin Pro 
545 550 555 560 

Ser Pro Leu Gly Arg Glu Ala Val Glu Gin Phe Phe Arg His Val Arg 
565 570 575 

30 Ala Gin Leu Asn He Arg Glu Tyr Val Lys Gin Asn Val Thr Pro Arg 
580 585 590 

Glu Thr Ala Gly Asp Ala Ala Ala Ala Tyr Leu Arg Ala Arg Thr Tyr 

595 600 605 

Ala Pro Ala Ala Leu Thr Pro Ala Pro Ala Tyr Cys Gly Val Ala Ala 
610 615 620

Asp Ser Ser Thr Lys Met Met Gly Arg Leu Ala Glu Ala Glu Arg Leu 
625 630 635 640 

Leu Val Pro Gly Trp Pro Ala Phe Ala Pro Thr Thr Pro Gly Asp Asp 
645 650 655 

40 Ala Gly Gly Gly Thr Ala Ala Pro Gin Thr Cys Gly He Val Lys Arg 
660 665 670 

Leu Leu Lys Leu Ala Ala Thr Glu Gin Gin Gly Thr Thr Pro Pro Ala 
675 680 685

Ile Ala Ala Leu Met Gln Asp Ala Ser Val Gln Thr Pro Leu Pro Val

690 695 700 

Tyr Arg Ile Thr Met Ser Pro Thr Gly Gln Ala Phe Ala Ala Ala Ala
705 710 715 720 

Arg Asp Asp Trp Ala Arg Val Thr Arg Asp Ala Arg Pro Pro Glu Ala 

725 730 735 

Thr Val Val Ala Asp Ala Ala Ala Ala Pro Glu Pro Gly Ala Leu Gly 

740 745 750 

Arg Arg Leu Thr Arg Arg He Cys Arg Pro Ala Pro Pro Pro Gly Arg 

755 760 765 

Pro Gly Arg Arg Gly Pro Asp Val Arg Glu Pro Gin Arg Asp Leu Gin 

770 775 780 

Arg Arg Ala Gly Arg Tyr Glu His His Pro Gly Ser Gly His Arg Pro 
785 790 795 800 

Glu Gly Ala Arg Pro Leu Ser Pro Ala Pro Arg Gly Pro Gly Ser Leu 
805 810 815 



(2) INFORMATION FOR SEQ ID NO: 290:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 184 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:

Met Thr Thr Thr Pro Leu Ser Asn Leu Phe Leu Arg Ala Pro Asp Ile
1 5 10 15

Thr His Val Ala Pro Pro Tyr Cys Leu Asn Ala Thr Trp Gln Ala Glu
20 25 30

Asn Ala Leu His Thr Thr Lys Thr Asp Pro Ala Cys Leu Ala Ala Arg
35 40 45

Ser Tyr Leu Val Arg Ala Ser Cys Ser Thr Ser Gly Pro Ile His Cys
50 55 60

Phe Phe Phe Ala Val Tyr Lys Asp Ser Gln His Ser Leu Pro Leu Val
65 70 75 80

Thr Glu Leu Arg Asn Phe Ala Asp Leu Val Asn His Pro Pro Val Leu
85 90 95

Arg Glu Leu Glu Asp Lys Arg Gly Gly Arg Leu Arg Cys Thr Gly Pro
100 105 110

Phe Ser Cys Gly Thr Ile Lys Asp Val Ser Gly Asp Ala Gly Glu Tyr
115 120 125

Thr Ile Asn Gly Ile Val Tyr His Cys His Cys Arg Tyr Pro Phe Ser
130 135 140

Lys Thr Cys Trp Leu Gly Ala Ser Ala Ala Leu Gln His Leu Arg Ser
145 150 155 160

Ile Ser Ser Ser Gly Thr Ala Ala Arg Ala Ala Glu Gln Arg Arg His
165 170 175

Lys Ile Lys Ile Lys Ile Lys Val
180

(2) INFORMATION FOR SEQ ID NO: 291:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 212 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291:

Met Trp Gly Pro Gly Pro Ala Arg Phe Ile Ala Arg Pro Gly Thr His
1 5 10 15

Gly Arg Arg Val Phe Thr Asp Pro Pro Pro Arg Asn Met Thr Thr Thr
20 25 30

Pro Leu Ser Asn Leu Phe Leu Arg Ala Pro Asp Ile Thr His Val Ala
35 40 45

Pro Pro Tyr Cys Leu Asn Ala Thr Trp Gln Ala Glu Asn Ala Leu His
50 55 60

Thr Thr Lys Thr Asp Pro Ala Cys Leu Ala Ala Arg Ser Tyr Leu Val
65 70 75 80

Arg Ala Ser Cys Ser Thr Ser Gly Pro Ile His Cys Phe Phe Phe Ala
85 90 95

Val Tyr Lys Asp Ser Gln His Ser Leu Pro Leu Val Thr Glu Leu Arg
100 105 110

Asn Phe Ala Asp Leu Val Asn His Pro Pro Val Leu Arg Glu Leu Glu
115 120 125

Asp Lys Arg Gly Gly Arg Leu Arg Cys Thr Gly Pro Phe Ser Cys Gly
130 135 140

Thr Ile Lys Asp Val Ser Gly Asp Ala Gly Glu Tyr Thr Ile Asn Gly
145 150 155 160

Ile Val Tyr His Cys His Cys Arg Tyr Pro Phe Ser Lys Thr Cys Trp
165 170 175

Leu Gly Ala Ser Ala Ala Leu Gln His Leu Arg Ser Ile Ser Ser Ser
180 185 190

Gly Thr Ala Ala Arg Ala Ala Glu Gln Arg Arg His Lys Ile Lys Ile
195 200 205

Lys Ile Lys Val
210

(2) INFORMATION FOR SEQ ID NO: 292:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 670 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292:

Met Ala Ala Gln Arg Ala Arg Ala Pro Ala Met Arg Thr Arg Gly Gly
1 5 10 15

Asp Ala Ala Leu Cys Ala Pro Glu Asp Gly Trp Val Lys Val His Pro 
25 20 25 30 

Thr Pro Gly Thr Met Leu Phe Arg Glu He Leu Leu Gly Gin Met Gly 

35 40 45 

Tyr Thr Glu Gly Gin Gly Val Tyr Asn Val Val Arg Ser Ser Glu Ala 
50 55 60 

30 Ala Thr Arg Gin Leu Gin Ala Ala He Phe His Ala Leu Leu Asn Ala 
65 70 75 80 

Thr Tyr Asp Leu Glu Glu Asp Trp Arg Arg His Val Val Arg Leu Gin 

85 90 95 

Pro Gin Arg Leu Val Arg Arg Tyr Arg Asn Ala Arg Glu Gly Asp He 
35 100 105 110 

Ala Gly Val Ala Glu Arg Val Phe Asp Thr Trp Arg Cys Thr Leu Arg 

115 120 125 

Thr Thr Leu Leu Asp Phe Ala His Gly Val Val Asp Cys Phe Ala Pro 
130 135 140 

40 Gly Gly Pro Ser Gly Pro Thr Ser Phe Pro Lys Tyr lie Asp Trp Leu 
145 150 155 160 

Thr Cys Leu Gly Leu Val Pro lie Leu Arg Lys Thr Arg Glu Gly Glu 
165 170 175

Ala Thr Gln Arg Leu Gly Ala Phe Leu Arg Gln His Thr Leu Pro Arg

180 185 190 

Gln Leu Ala Thr Val Ala Gly Ala Ala Glu Arg Ala Gly Pro Gly Leu
195 200 205 

5 Leu Glu Leu Ala Val Ala Phe Asp Ser Thr Arg Met Ala Glu Tyr Asp 
210 215 220 

Arg Val His He Tyr Tyr Asn His Arg Arg Gly Glu Trp Leu Val Arg 
225 230 235 240 

Asp Pro Val Ser Gly Gin Arg Gly Glu Cys Leu Val Leu Cys Pro Pro 
10 245 250 255 

Leu Trp Thr Gly Asp Arg Leu Val Phe Asp Ser Pro Val Gin Arg Leu 

260 265 270 

Cys Pro Glu He Val Ala Cys His Ala Leu Arg Glu His Ala His He 
275 280 285 

15 Cys Arg Leu Arg Asn Thr Ala Ser Val Lys Val Leu Leu Gly Arg Lys 
290 295 300 

Ser Asp Ser Gly Val Ala Gly Ala Ala Ala Arg Val Val Asn Lys Ala 
305 310 315 320 

Leu Gly Glu Asp Asp Glu Thr Lys Ala Gly Ser Ala Ala Ser Arg Leu 
20 325 330 335 

Val Arg Leu He He Met Lys Gly Met Arg His Val Gly Asp He Asn 

340 345 350 

Asp Thr Val Arg Ala Tyr Leu Asp Glu Ala Gly Gly His Leu He Asp 
355 360 365 

25 Thr Pro Ala Val Asp His Thr Leu Pro Gly Phe Gly Lys Gly Gly Thr 
370 375 380 

Gly Arg Gly Ser Ala Ala Gin Asp Pro Gly Ala Arg Pro Gin Gin Leu 
385 390 395 400 

Arg Gin Ala Phe Gin Thr Ala Val Val Asn Asn He Asn Gly Met Leu 
30 405 410 415 

Glu Gly Tyr He Asn Asn Leu Phe Gly Thr He Glu Arg Leu Arg Glu 

420 425 430 

Thr Asn Ala Gly Leu Ala Thr Gin Leu Gin Ala Arg Asp Arg Glu Leu 
435 440 445 

35 Arg Arg Ala Gin Ala Gly Ala Leu Glu Arg Glu Gin Arg Ala Ala Asp 
450 455 460 

Arg Ala Ala Gly Gly Gly Ala Gly Arg Pro Ala Glu Ala Asp Leu Leu 
465 470 475 480 

Arg Ala Asp Tyr Asp lie lie Asp Val Ser Lys Ser Met Asp Asp Asp 
40 485 490 495 

Thr Tyr Val Ala Asn Ser Phe Gin His Gin Tyr lie Pro Ala Tyr Gly 

500 505 510 

Gln Asp Leu Glu Arg Leu Ser Arg Leu Trp Glu His Glu Leu Val Arg
515 520 525

Cys Phe Lys lie Leu Arg His Arg Asn Asn Gin Gly Gin Glu Thr Ser 

530 535 540

lie Ser Tyr Ser Ser Gly Ala lie Ala Ser Phe Val Ala Pro Tyr Phe 
5 545 550 555 560 

Glu Tyr Val Leu Arg Ala Pro Arg Ala Gly Ala Leu lie Thr Gly Ser 

565 570 575 

Asp Val He Leu Gly Glu Glu Glu Leu Trp Glu Ala Val Phe Lys Lys 
580 585 590 

10 Thr Arg Leu Gin Thr Tyr Leu Thr Asp Val Ala Ala Leu Phe Val Ala 
595 600 605 

Asp Val Gin His Ala Ala Leu Pro Arg Pro Pro Ser Pro Thr Pro Ala 

610 615 620 

Asp Phe Arg Ala Ser Asp Arg Gly Gly Ser Arg Ser Arg Thr Arg Thr 
15 625 630 635 640 

Arg Ser Arg Ser Pro Gly Arg Thr Pro Arg Gly Ala Pro Asp Gin Gly 

645 650 655 

Trp Gly Val Glu Arg Arg Asp Gly Arg Pro His Ala Arg Arg 
660 665 670 



(2) INFORMATION FOR SEQ ID NO: 293:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 710 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293:

Met Asp Val Lys Phe Lys Asn Ala Ser Ser Leu Asn Arg Thr Ala Gly
1 5 10 15

35 Leu Ala Pro Gly Cys Cys Gly Gly Gly Pro Gly Ala Arg Thr Ser Arg 
20 25 30 

Glu Pro Ser Pro Pro Asp Ala Ala Met Ala Ala Gin Arg Ala Arg Ala 

35 40 45 

Pro Ala Met Arg Thr Arg Gly Gly Asp Ala Ala Leu Cys Ala Pro Glu 
40 50 55 60 

Asp Gly Trp Val Lys Val His Pro Thr Pro Gly Thr Met Leu Phe Arg 
65 70 75 80 

Glu Ile Leu Leu Gly Gln Met Gly Tyr Thr Glu Gly Gln Gly Val Tyr
85 90 95

Asn Val Val Arg Ser Ser Glu Ala Ala Thr Arg Gin Leu Gin Ala Ala 

100 105 110

lie Phe His Ala Leu Leu Asn Ala Thr Tyr Asp Leu Glu Glu Asp Trp 
5 115 120 125 

Arg Arg His Val Val Arg Leu Gin Pro Gin Arg Leu Val Arg Arg Tyr 

130 135 140 

Arg Asn Ala Arg Glu Gly Asp He Ala Gly Val Ala Glu Arg Val Phe 
145 150 155 160 

10 Asp Thr Trp Arg Cys Thr Leu Arg Thr Thr Leu Leu Asp Phe Ala His 

165 170 175 

Gly Val Val Asp Cys Phe Ala Pro Gly Gly Pro Ser Gly Pro Thr Ser 

180 185 190 

Phe Pro Lys Tyr He Asp Trp Leu Thr Cys Leu Gly Leu Val Pro He 
15 195 200 205 

Leu Arg Lys Thr Arg Glu Gly Glu Ala Thr Gin Arg Leu Gly Ala Phe 

210 215 220 

Leu Arg Gin His Thr Leu Pro Arg Gin Leu Ala Thr Val Ala Gly Ala 
225 230 235 240 

20 Ala Glu Arg Ala Gly Pro Gly Leu Leu Glu Leu Ala Val Ala Phe Asp 

245 250 255 

Ser Thr Arg Met Ala Glu Tyr Asp Arg Val His He Tyr Tyr Asn His 

260 265 270 

Arg Arg Gly Glu Trp Leu Val Arg Asp Pro Val Ser Gly Gin Arg Gly 
25 275 280 285 

Glu Cys Leu Val Leu Cys Pro Pro Leu Trp Thr Gly Asp Arg Leu Val 

290 295 300 

Phe Asp Ser Pro Val Gin Arg Leu Cys Pro Glu He Val Ala Cys His 
305 310 315 320 

30 Ala Leu Arg Glu His Ala His He Cys Arg Leu Arg Asn Thr Ala Ser 

325 330 335 

Val Lys Val Leu Leu Gly Arg Lys Ser Asp Ser Gly Val Ala Gly Ala 

340 345 350 

Ala Arg Val Val Asn Lys Ala Leu Gly Glu Asp Asp Glu Thr Lys Ala 
35 355 360 365 

Gly Ser Ala Ala Ser Arg Leu Val Arg Leu He He Asn Met Lys Gly 

370 375 380 

Met Arg His Val Gly Asp lie Asn Asp Thr Val Arg Ala Tyr Leu Asp 
385 390 395 400 

40 Glu Ala Gly Gly His Leu He Asp Thr Pro Ala Val Asp His Thr Leu 

405 410 415 

Pro Gly Phe Gly Lys Gly Gly Thr Gly Arg Gly Ser Ala Ala Gin Asp 
420 425 430

Pro Gly Ala Arg Pro Gln Gln Leu Arg Gln Ala Phe Gln Thr Ala Val

435 440 445 

Val Asn Asn Ile Asn Gly Met Leu Glu Gly Tyr Ile Asn Asn Leu Phe
450 455 460 

5 Gly Thr lie Glu Arg Leu Arg Glu Thr Asn Ala Gly Leu Ala Thr Gin 
465 470 475 480 

Leu Gin Ala Arg Asp Arg Glu Leu Arg Arg Ala Gin Ala Gly Ala Leu 

485 490 495 

Glu Arg Glu Gin Arg Ala Ala Asp Arg Ala Ala Gly Gly Gly Ala Gly 
10 500 505 510 

Arg Pro Ala Glu Ala Asp Leu Leu Arg Ala Asp Tyr Asp lie He Asp 

515 520 525 

Val Ser Lys Ser Met Asp Asp Asp Thr Tyr Val Ala Asn Ser Phe Gin 
530 535 540 

15 His Gin Tyr He Pro Ala Tyr Gly Gin Asp Leu Glu Arg Leu Ser Arg 
545 550 555 560 

Leu Trp Glu His Glu Leu Val Arg Cys Phe Lys He Leu Arg His Arg 

565 570 575 

Asn Asn Gin Gly Gin Glu Thr Ser He Ser Tyr Ser Ser Gly Ala He 
20 580 585 590 

Ala Ser Phe Val Ala Pro Tyr Phe Glu Tyr Val Leu Arg Ala Pro Arg 

595 600 605 

Ala Gly Ala Leu He Thr Gly Ser Asp Val He Leu Gly Glu Glu Glu 
610 615 620 

25 Leu Trp Glu Ala Val Phe Lys Lys Thr Arg Leu Gin Thr Tyr Leu Thr 
625 630 635 640 

Asp Val Ala Ala Leu Phe Val Ala Asp Val Gin His Ala Ala Leu Pro 

645 650 655 

Arg Pro Pro Ser Pro Thr Pro Ala Asp Phe Arg Ala Ser Asp Arg Gly 
30 660 665 670 

Gly Ser Arg Ser Arg Thr Arg Thr Arg Ser Arg Ser Pro Gly Arg Thr 

675 680 685 

Pro Arg Gly Ala Pro Asp Gin Gly Trp Gly Val Glu Arg Arg Asp Gly 
690 695 700 

Arg Pro His Ala Arg Arg
705 710 



(2) INFORMATION FOR SEQ ID NO: 294:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 720 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:

Met Arg Ala Met Ile Gly Trp Thr Pro Cys Met Asp Val Lys Phe Lys

1 5 10 15 

Asn Ala Ser Ser Leu Asn Arg Thr Ala Gly Leu Ala Pro Gly Cys Cys 
10 20 25 30 

Gly Gly Gly Pro Gly Ala Arg Thr Ser Arg Glu Pro Ser Pro Pro Asp 

35 40 45 

Ala Ala Met Ala Ala Gin Arg Ala Arg Ala Pro Ala Met Arg Thr Arg 
50 55 60 

15 Gly Gly Asp Ala Ala Leu Cys Ala Pro Glu Asp Gly Trp Val Lys Val 
65 70 75 80 

His Pro Thr Pro Gly Thr Met Leu Phe Arg Glu lie Leu Leu Gly Gin 

85 90 95 

Met Gly Tyr Thr Glu Gly Gin Gly Val Tyr Asn Val Val Arg Ser Ser 
20 100 105 110 

Glu Ala Ala Thr Arg Gin Leu Gin Ala Ala lie Phe His Ala Leu Leu 

115 120 125 

Asn Ala Thr Tyr Asp Leu Glu Glu Asp Trp Arg Arg His Val Val Arg 
130 135 140 

25 Leu Gin Pro Gin Arg Leu Val Arg Arg Tyr Arg Asn Ala Arg Glu Gly 
145 150 155 160 

Asp lie Ala Gly Val Ala Glu Arg Val Phe Asp Thr Trp Arg Cys Thr 

165 170 175 

Leu Arg Thr Thr Leu Leu Asp Phe Ala His Gly Val Val Asp Cys Phe 
30 180 185 190 

Ala Pro Gly Gly Pro Ser Gly Pro Thr Ser Phe Pro Lys Tyr lie Asp 

195 200 205 

Trp Leu Thr Cys Leu Gly Leu Val Pro lie Leu Arg Lys Thr Arg Glu 
210 215 220 

35 Gly Glu Ala Thr Gin Arg Leu Gly Ala Phe Leu Arg Gin His Thr Leu 
225 230 235 240 

Pro Arg Gin Leu Ala Thr Val Ala Gly Ala Ala Glu Arg Ala Gly Pro 

245 250 255 

Gly Leu Leu Glu Leu Ala Val Ala Phe Asp Ser Thr Arg Met Ala Glu 
40 260 265 270 

Tyr Asp Arg Val His lie Tyr Tyr Asn His Arg Arg Gly Glu Trp Leu 

275 280 285 

Val Arg Asp Pro Val Ser Gly Gln Arg Gly Glu Cys Leu Val Leu Cys
290 295 300

Pro Pro Leu Trp Thr Gly Asp Arg Leu Val Phe Asp Ser Pro Val Gin 
305 310 315 320 

Arg Leu Cys Pro Glu He Val Ala Cys His Ala Leu Arg Glu His Ala 
5 325 330 335 

His He Cys Arg Leu Arg Asn Thr Ala Ser Val Lys Val Leu Leu Gly 

340 345 350 

Arg Lys Ser Asp Ser Gly Val Ala Gly Ala Ala Arg Val Val Asn Lys 
355 360 365 

10 Ala Leu Gly Glu Asp Asp Glu Thr Lys Ala Gly Ser Ala Ala Ser Arg 
370 375 380

Leu Val Arg Leu Ile Ile Asn Met Lys Gly Met Arg His Val Gly Asp
385 390 395 400 

He Asn Asp Thr Val Arg Ala Tyr Leu Asp Glu Ala Gly Gly His Leu 
15 405 410 415 

He Asp Thr Pro Ala Val Asp His Thr Leu Pro Gly Phe Gly Lys Gly 

420 425 430 

Gly Thr Gly Arg Gly Ser Ala Ala Gin Asp Pro Gly Ala Arg Pro Gin 
435 440 445

20 Gin Leu Arg Gin Ala Phe Gin Thr Ala Val Val Asn Asn He Asn Gly 
450 455 460 

Met Leu Glu Gly Tyr He Asn Asn Leu Phe Gly Thr He Glu Arg Leu 
465 470 475 480 

Arg Glu Thr Asn Ala Gly Leu Ala Thr Gin Leu Gin Ala Arg Asp Arg 
25 485 490 495 

Glu Leu Arg Arg Ala Gin Ala Gly Ala Leu Glu Arg Glu Gin Arg Ala 

500 505 510 

Ala Asp Arg Ala Ala Gly Gly Gly Ala Gly Arg Pro Ala Glu Ala Asp 
515 520 525 

30 Leu Leu Arg Ala Asp Tyr Asp He He Asp Val Ser Lys Ser Met Asp 
530 535 540 

Asp Asp Thr Tyr Val Ala Asn Ser Phe Gin His Gin Tyr He Pro Ala 
545 550 555 560 

Tyr Gly Gin Asp Leu Glu Arg Leu Ser Arg Leu Trp Glu His Glu Leu 
35 565 570 575 

Val Arg Cys Phe Lys lie Leu Arg His Arg Asn Asn Gin Gly Gin Glu 

580 585 590 

Thr Ser He Ser Tyr Ser Ser Gly Ala He Ala Ser Phe Val Ala Pro 
595 600 605 

40 Tyr Phe Glu Tyr Val Leu Arg Ala Pro Arg Ala Gly Ala Leu He Thr 
610 615 620 

Gly Ser Asp Val He Leu Gly Glu Glu Glu Leu Trp Glu Ala Val Phe 
625 630 635 640

Lys Lys Thr Arg Leu Gln Thr Tyr Leu Thr Asp Val Ala Ala Leu Phe

645 650 655 

Val Ala Asp Val Gin His Ala Ala Leu Pro Arg Pro Pro Ser Pro Thr 
660 665 670 

5 Pro Ala Asp Phe Arg Ala Ser Asp Arg Gly Gly Ser Arg Ser Arg Thr 
675 680 685 

Arg Thr Arg Ser Arg Ser Pro Gly Arg Thr Pro Arg Gly Ala Pro Asp 

690 695 700 

Gin Gly Trp Gly Val Glu Arg Arg Asp Gly Arg Pro His Ala Arg Arg 
10 705 710 715 720 

(2) INFORMATION FOR SEQ ID NO: 295:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 763 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:

Met Arg Tyr Ala Ala Asn Gly Asn Ser Arg Ser Gly Arg Pro Val Gly
1 5 10 15

Thr Ser Lys Ala Ala Thr Ser Arg Asn His Cys Arg Arg Gly Thr Cys 

20 25 30 

Val Thr Ser Ser Cys Cys Cys Glu Ser Ser Arg Met Arg Ala Met He 
35 40 45

30 Gly Trp Thr Pro Cys Met Asp Val Lys Phe Lys Asn Ala Ser Ser Leu 
50 55 60 

Asn Arg Thr Ala Gly Leu Ala Pro Gly Cys Cys Gly Gly Gly Pro Gly 
65 70 75 80 

Ala Arg Thr Ser Arg Glu Pro Ser Pro Pro Asp Ala Ala Met Ala Ala 
35 85 90 95 

Gin Arg Ala Arg Ala Pro Ala Met Arg Thr Arg Gly Gly Asp Ala Ala 

100 105 110 

Leu Cys Ala Pro Glu Asp Gly Trp Val Lys Val His Pro Thr Pro Gly 
115 120 125 

40 Thr Met Leu Phe Arg Glu He Leu Leu Gly Gin Met Gly Tyr Thr Glu 
130 135 140 

Gly Gin Gly Val Tyr Asn Val Val Arg Ser Ser Glu Ala Ala Thr Arg 
145 150 155 160

Gln Leu Gln Ala Ala Ile Phe His Ala Leu Leu Asn Ala Thr Tyr Asp

165 170 175 

Leu Glu Glu Asp Trp Arg Arg His Val Val Arg Leu Gin Pro Gin Arg 
180 185 190 

5 Leu Val Arg Arg Tyr Arg Asn Ala Arg Glu Gly Asp lie Ala Gly Val 
195 200 205 

Ala Glu Arg Val Phe Asp Thr Trp Arg Cys Thr Leu Arg Thr Thr Leu 

210 215 220 

Leu Asp Phe Ala His Gly Val Val Asp Cys Phe Ala Pro Gly Gly Pro 
10 225 230 235 240 

Ser Gly Pro Thr Ser Phe Pro Lys Tyr lie Asp Trp Leu Thr Cys Leu 

245 250 255 

Gly Leu Val Pro lie Leu Arg Lys Thr Arg Glu Gly Glu Ala Thr Gin 
260 265 270 

15 Arg Leu Gly Ala Phe Leu Arg Gin His Thr Leu Pro Arg Gin Leu Ala 
275 280 285 

Thr Val Ala Gly Ala Ala Glu Arg Ala Gly Pro Gly Leu Leu Glu Leu 

290 295 300 

Ala Val Ala Phe Asp Ser Thr Arg Met Ala Glu Tyr Asp Arg Val His 
20 305 310 315 320 

lie Tyr Tyr Asn His Arg Arg Gly Glu Trp Leu Val Arg Asp Pro Val 

325 330 335 

Ser Gly Gin Arg Gly Glu Cys Leu Val Leu Cys Pro Pro Leu Trp Thr 
340 345 350 

25 Gly Asp Arg Leu Val Phe Asp Ser Pro Val Gin Arg Leu Cys Pro Glu 
355 360 365 

lie Val Ala Cys His Ala Leu Arg Glu His Ala His He Cys Arg Leu 

370 375 380

Arg Asn Thr Ala Ser Val Lys Val Leu Leu Gly Arg Lys Ser Asp Ser 
30 385 390 395 400 

Gly Val Ala Gly Ala Ala Arg Val Val Asn Lys Ala Leu Gly Glu Asp 

405 410 415 

Asp Glu Thr Lys Ala Gly Ser Ala Ala Ser Arg Leu Val Arg Leu He 
420 425 430 

35 He Asn Met Lys Gly Met Arg His Val Gly Asp He Asn Asp Thr Val 
435 440 445 

Arg Ala Tyr Leu Asp Glu Ala Gly Gly His Leu lie Asp Thr Pro Ala 

450 455 460 

Val Asp His Thr Leu Pro Gly Phe Gly Lys Gly Gly Thr Gly Arg Gly 
40 465 470 475 480 

Ser Ala Ala Gin Asp Pro Gly Ala Arg Pro Gin Gin Leu Arg Gin Ala 

485 490 495 

Phe Gin Thr Ala Val Val Asn Asn lie Asn Gly Met Leu Glu Gly Tyr 
500 505 510

He Asn Asn Leu Phe Gly Thr lie Glu Arg Leu Arg Glu Thr Asn Ala 

515 520 525 

Gly Leu Ala Thr Gin Leu Gin Ala Arg Asp Arg Glu Leu Arg Arg Ala 
5 530 535 540 

Gin Ala Gly Ala Leu Glu Arg Glu Gin Arg Ala Ala Asp Arg Ala Ala 
545 550 555 560 

Gly Gly Gly Ala Gly Arg Pro Ala Glu Ala Asp Leu Leu Arg Ala Asp
565 570 575 

10 Tyr Asp He He Asp Val Ser Lys Ser Met Asp Asp Asp Thr Tyr Val 
580 585 590 

Ala Asn Ser Phe Gin His Gin Tyr He Pro Ala Tyr Gly Gin Asp Leu 

595 600 605 

Glu Arg Leu Ser Arg Leu Trp Glu His Glu Leu Val Arg Cys Phe Lys 
15 610 615 620 

He Leu Arg His Arg Asn Asn Gin Gly Gin Glu Thr Ser He Ser Tyr 
625 630 635 640 

Ser Ser Gly Ala He Ala Ser Phe Val Ala Pro Tyr Phe Glu Tyr Val 
645 650 655 

20 Leu Arg Ala Pro Arg Ala Gly Ala Leu He Thr Gly Ser Asp Val He 
660 665 670 

Leu Gly Glu Glu Glu Leu Trp Glu Ala Val Phe Lys Lys Thr Arg Leu 

675 680 685 

Gin Thr Tyr Leu Thr Asp Val Ala Ala Leu Phe Val Ala Asp Val Gin 
25 690 695 700 

His Ala Ala Leu Pro Arg Pro Pro Ser Pro Thr Pro Ala Asp Phe Arg 
705 710 715 720 

Ala Ser Asp Arg Gly Gly Ser Arg Ser Arg Thr Arg Thr Arg Ser Arg 
725 730 735 

30 Ser Pro Gly Arg Thr Pro Arg Gly Ala Pro Asp Gin Gly Trp Gly Val 
740 745 750 

Glu Arg Arg Asp Gly Arg Pro His Ala Arg Arg 
755 760 

(2) INFORMATION FOR SEQ ID NO: 296:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 296:

Met Ala Asp lie Pro Pro Asp Pro Pro Ala Leu Asn Thr Thr Pro Ala 
1 5 10 15

Asn His Ala Pro Pro Ser Pro Pro Pro Gly Ser Arg Lys Arg Arg Arg 

20 25 30 

Pro Val Leu Pro Ser Ser Ser Glu Ser Glu Gly Lys Pro Asp Thr Glu 
35 40 45 

10 Ser Glu Ser Ser Ser Thr Glu Ser Ser Glu Asp Glu Ala Gly Asp Leu 
50 55 60 

Arg Gly Gly Arg Arg Arg Ser Pro Arg Glu Leu Gly Gly Arg Tyr Phe 
65 70 75 80 

Leu Asp Leu Ser Ala Glu Ser Thr Thr Gly Thr Glu Ser Glu Gly Thr 
15 85 90 95 

Gly Pro Ser Asp Asp Asp Asp Asp Asp Ala Ser Asp Gly Trp Leu Val 

100 105 110 

Asp Thr Pro Pro Arg Lys Ser Lys Arg Pro Arg lie Asn Leu Arg Leu 
115 120 125 

20 Thr Ser Ser Pro Asp Arg Arg Ala Gly Val Val Phe Pro Glu Val Trp 
130 135 140 

Arg Asn Asp Arg Pro lie Arg Ala Ala Gin Pro Gin Ala Pro Ala Gin 
145 150 155 160 

Ser Ser Gly Asp Arg Ala Ala Ala Pro Arg Arg Ser Ala Arg Gin Ala 
25 165 170 175 

Gin Met Arg Ser Gly Ala Ala Trp Thr Leu Asp Leu His Tyr lie Arg 

180 185 190 

Gin Cys Val Asn Gin Leu Phe Arg lie Leu Arg Ala Ala Pro Asn Pro 
195 200 205 

30 Pro Gly Ser Ala Asn Arg Leu Arg His Leu Val Arg Asp Cys Tyr Leu 
210 215 220 

Met Gly Tyr Cys Arg Thr Arg Leu Gly Pro Arg Thr Trp Gly Arg Leu 
225 230 235 240 

Leu Gin lie Ser Gly Gly Thr Trp Asp Val Arg Leu Arg Asn Ala lie 
35 245 250 255 

Arg Glu Val Glu Ala Arg Phe Glu Pro Ala Ala Glu Pro Val Cys Glu 

260 265 270 

Leu Pro Cys Leu Asn Ala Arg Arg Tyr Gly Pro Glu Cys Asp Val Gly 
275 280 285 

40 Asn Leu Glu Thr Asn Gly Gly Ser Thr Ser Asp Asp Glu lie Ser Asp 
290 295 300 

Ala Thr Asp Ser Asp Asp Thr Leu Ala Ser His Ser Asp Thr Glu Gly 
305 310 315 320 
Gly Pro Ser Pro Ala Gly Arg Glu Asn Pro Glu Ser Ala Ser Gly Gly

325 330 335 

Ala Ile Ala Ala Arg Leu Glu Cys Glu Phe Gly Thr Phe Asp Trp Thr

340 345 350 

5 Ser Glu Glu Gly Ser Gin Pro Trp Leu Ser Ala Val Val Ala Asp Thr 

355 360 365 

Ser Ser Ala Glu Arg Ser Gly Leu Pro Ala Pro Gly Ala Cys Arg Ala 

370 375 380 

Thr Glu Ala Pro Glu Arg Glu Asp Gly Cys Arg Lys Met Arg Phe Pro 

10 385 390 395 400 

Ala Ala Cys Pro Tyr Pro Cys Gly His Thr Phe Leu Arg Pro 

405 410 






(2) INFORMATION FOR SEQ ID NO: 297: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 810 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:297: 


Met Gly Arg Leu Arg Asn Ala Pro Glu Ser Leu Thr Tyr Met Phe Cys 

1 5 10 15

Ala Ala He Arg Val Ala Pro Val Thr Thr Gin Ser Arg Thr Ser Leu 
20 25 30 

30 Arg Val Cys Thr His Val Leu Phe Pro Asp Pro Ala Leu Pro Val Met 
35 40 45 

Arg Tyr Ala Ala Asn Gly Asn Ser Arg Ser Gly Arg Pro Val Gly Thr 

50 55 60 

Ser Lys Ala Ala Thr Ser Arg Asn His Cys Arg Arg Gly Thr Cys Val 
35 65 70 75 80 

Thr Ser Ser Cys Cys Cys Glu Ser Ser Arg Met Arg Ala Met He Gly 

85 90 95 

Trp Thr Pro Cys Met Asp Val Lys Phe Lys Asn Ala Ser Ser Leu Asn 
100 105 110 

40 Arg Thr Ala Gly Leu Ala Pro Gly Cys Cys Gly Gly Gly Pro Gly Ala 
115 120 125 

Arg Thr Ser Arg Glu Pro Ser Pro Pro Asp Ala Ala Met Ala Ala Gin 
130 135 140 
Arg Ala Arg Ala Pro Ala Met Arg Thr Arg Gly Gly Asp Ala Ala Leu
145 150 155 160 

Cys Ala Pro Glu Asp Gly Trp Val Lys Val His Pro Thr Pro Gly Thr
165 170 175 

5 Met Leu Phe Arg Glu lie Leu Leu Gly Gin Met Gly Tyr Thr Glu Gly 
180 185 190 

Gin Gly Val Tyr Asn Val Val Arg Ser Ser Glu Ala Ala Thr Arg Gin 

195 200 205 

Leu Gin Ala Ala lie Phe His Ala Leu Leu Asn Ala Thr Tyr Asp Leu 
10 210 215 220 

Glu Glu Asp Trp Arg Arg His Val Val Arg Leu Gin Pro Gin Arg Leu 
225 230 235 240 

Val Arg Arg Tyr Arg Asn Ala Arg Glu Gly Asp lie Ala Gly Val Ala 
245 250 255 

15 Glu Arg Val Phe Asp Thr Trp Arg Cys Thr Leu Arg Thr Thr Leu Leu 
260 265 270 

Asp Phe Ala His Gly Val Val Asp Cys Phe Ala Pro Gly Gly Pro Ser 

275 280 285

Gly Pro Thr Ser Phe Pro Lys Tyr lie Asp Trp Leu Thr Cys Leu Gly 
20 290 295 300 

Leu Val Pro lie Leu Arg Lys Thr Arg Glu Gly Glu Ala Thr Gin Arg 
305 310 315 320 

Leu Gly Ala Phe Leu Arg Gin His Thr Leu Pro Arg Gin Leu Ala Thr 
325 330 335 

25 Val Ala Gly Ala Ala Glu Arg Ala Gly Pro Gly Leu Leu Glu Leu Ala 
340 345 350 

Val Ala Phe Asp Ser Thr Arg Met Ala Glu Tyr Asp Arg Val His lie 

355 360 365 

Tyr Tyr Asn His Arg Arg Gly Glu Trp Leu Val Arg Asp Pro Val Ser 
30 370 375 380 

Gly Gin Arg Gly Glu Cys Leu Val Leu Cys Pro Pro Leu Trp Thr Gly 
385 390 395 400 

Asp Arg Leu Val Phe Asp Ser Pro Val Gin Arg Leu Cys Pro Glu lie 
405 410 415 

35 Val Ala Cys His Ala Leu Arg Glu His Ala His He Cys Arg Leu Arg 
420 425 430 

Asn Thr Ala Ser Val Lys Val Leu Leu Gly Arg Lys Ser Asp Ser Gly 

435 440 445 

Val Ala Gly Ala Ala Arg Val Val Asn Lys Ala Leu Gly Glu Asp Asp 
40 450 455 460 

Glu Thr Lys Ala Gly Ser Ala Ala Ser Arg Leu Val Arg Leu He He 
465 470 475 480 

Asn Met Lys Gly Met Arg His Val Gly Asp He Asn Asp Thr Val Arg 
485 490 495



Ala Tyr Leu Asp Glu Ala Gly Gly His Leu lie Asp Thr Pro Ala Val 

500 505 510

Asp His Thr Leu Pro Gly Phe Gly Lys Gly Gly Thr Gly Arg Gly Ser 

515 520 525 

Ala Ala Gin Asp Pro Gly Ala Arg Pro Gin Gin Leu Arg Gin Ala Phe 

530 535 540 

Gln Thr Ala Val Val Asn Asn Ile Asn Gly Met Leu Glu Gly Tyr Ile
545 550 555 560



Asn Asn Leu Phe Gly Thr Ile Glu Arg Leu Arg Glu Thr Asn Ala Gly

565 570 575 

Leu Ala Thr Gin Leu Gin Ala Arg Asp Arg Glu Leu Arg Arg Ala Gin 

580 585 590 

Ala Gly Ala Leu Glu Arg Glu Gin Arg Ala Ala Asp Arg Ala Ala Gly 
15 595 600 605 

Gly Gly Ala Gly Arg Pro Ala Glu Ala Asp Leu Leu Arg Ala Asp Tyr 

610 615 620 

Asp lie lie Asp Val Ser Lys Ser Met Asp Asp Asp Thr Tyr Val Ala 
625 630 635 640 

20 Asn Ser Phe Gin His Gin Tyr He Pro Ala Tyr Gly Gin Asp Leu Glu 

645 650 655 

Arg Leu Ser Arg Leu Trp Glu His Glu Leu Val Arg Cys Phe Lys He 

660 665 670 

Leu Arg His Arg Asn Asn Gin Gly Gin Glu Thr Ser He Ser Tyr Ser 
25 675 680 685 

Ser Gly Ala He Ala Ser Phe Val Ala Pro Tyr Phe Glu Tyr Val Leu 

690 695 700 

Arg Ala Pro Arg Ala Gly Ala Leu lie Thr Gly Ser Asp Val He Leu 
705 710 715 720 

30 Gly Glu Glu Glu Leu Trp Glu Ala Val Phe Lys Lys Thr Arg Leu Gin 

725 730 735 

Thr Tyr Leu Thr Asp Val Ala Ala Leu Phe Val Ala Asp Val Gin His 

740 745 750 

Ala Ala Leu Pro Arg Pro Pro Ser Pro Thr Pro Ala Asp Phe Arg Ala 
35 755 760 765 

Ser Asp Arg Gly Gly Ser Arg Ser Arg Thr Arg Thr Arg Ser Arg Ser 

770 775 780 

Pro Gly Arg Thr Pro Arg Gly Ala Pro Asp Gin Gly Trp Gly Val Glu 
785 790 795 800 

Arg Arg Asp Gly Arg Pro His Ala Arg Arg
805 810



(2) INFORMATION FOR SEQ ID NO: 298: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 813 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:

Met Val Leu Met Gly Arg Leu Arg Asn Ala Pro Glu Ser Leu Thr Tyr 

1 5 10 15 

Met Phe Cys Ala Ala He Arg Val Ala Pro Val Thr Thr Gin Ser Arg 
15 20 25 30 

Thr Ser Leu Arg Val Cys Thr His Val Leu Phe Pro Asp Pro Ala Leu 

35 40 45 

Pro Val Met Arg Tyr Ala Ala Asn Gly Asn Ser Arg Ser Gly Arg Pro 
50 55 60 

20 Val Gly Thr Ser Lys Ala Ala Thr Ser Arg Asn His Cys Arg Arg Gly 
65 70 75 80 

Thr Cys Val Thr Ser Ser Cys Cys Cys Glu Ser Ser Arg Met Arg Ala 

85 90 95 

Met He Gly Trp Thr Pro Cys Met Asp Val Lys Phe Lys Asn Ala Ser 
25 100 105 110 

Ser Leu Asn Arg Thr Ala Gly Leu Ala Pro Gly Cys Cys Gly Gly Gly 

115 120 125 

Pro Gly Ala Arg Thr Ser Arg Glu Pro Ser Pro Pro Asp Ala Ala Met 
130 135 140 

30 Ala Ala Gin Arg Ala Arg Ala Pro Ala Met Arg Thr Arg Gly Gly Asp 
145 150 155 160 

Ala Ala Leu Cys Ala Pro Glu Asp Gly Trp Val Lys Val His Pro Thr 

165 170 175 

Pro Gly Thr Met Leu Phe Arg Glu He Leu Leu Gly Gin Met Gly Tyr 
35 180 185 190 

Thr Glu Gly Gin Gly Val Tyr Asn Val Val Arg Ser Ser Glu Ala Ala 

195 200 205 

Thr Arg Gin Leu Gin Ala Ala He Phe His Ala Leu Leu Asn Ala Thr 
210 215 220 

40 Tyr Asp Leu Glu Glu Asp Trp Arg Arg His Val Val Arg Leu Gin Pro 
225 230 235 240 

Gin Arg Leu Val Arg Arg Tyr Arg Asn Ala Arg Glu Gly Asp lie Ala 
245 250 255 
Gly Val Ala Glu Arg Val Phe Asp Thr Trp Arg Cys Thr Leu Arg Thr

260 265 270 

Thr Leu Leu Asp Phe Ala His Gly Val Val Asp Cys Phe Ala Pro Gly
275 280 285 

5 Gly Pro Ser Gly Pro Thr Ser Phe Pro Lys Tyr He Asp Trp Leu Thr 
290 295 300 

Cys Leu Gly Leu Val Pro He Leu Arg Lys Thr Arg Glu Gly Glu Ala 
305 310 315 320 

Thr Gin Arg Leu Gly Ala Phe Leu Arg Gin His Thr Leu Pro Arg Gin 
10 325 330 335 

Leu Ala Thr Val Ala Gly Ala Ala Glu Arg Ala Gly Pro Gly Leu Leu 

340 345 350 

Glu Leu Ala Val Ala Phe Asp Ser Thr Arg Met Ala Glu Tyr Asp Arg 
355 360 365 

15 Val His He Tyr Tyr Asn His Arg Arg Gly Glu Trp Leu Val Arg Asp 
370 375 380 

Pro Val Ser Gly Gin Arg Gly Glu Cys Leu Val Leu Cys Pro Pro Leu 
385 390 395 400 

Trp Thr Gly Asp Arg Leu Val Phe Asp Ser Pro Val Gin Arg Leu Cys 
20 405 410 415 

Pro Glu He Val Ala Cys His Ala Leu Arg Glu His Ala His lie Cys 

420 425 430 

Arg Leu Arg Asn Thr Ala Ser Val Lys Val Leu Leu Gly Arg Lys Ser 
435 440 445 

25 Asp Ser Gly Val Ala Gly Ala Ala Arg Val Val Asn Lys Ala Leu Gly 
450 455 460

Glu Asp Asp Glu Thr Lys Ala Gly Ser Ala Ala Ser Arg Leu Val Arg 
465 470 475 480

Leu He He Asn Met Lys Gly Met Arg His Val Gly Asp He Asn Asp 
30 485 490 495 

Thr Val Arg Ala Tyr Leu Asp Glu Ala Gly Gly His Leu He Asp Thr 

500 505 510 

Pro Ala Val Asp His Thr Leu Pro Gly Phe Gly Lys Gly Gly Thr Gly 
515 520 525

35 Arg Gly Ser Ala Ala Gin Asp Pro Gly Ala Arg Pro Gin Gin Leu Arg 
530 535 540 

Gin Ala Phe Gin Thr Ala Val Val Asn Asn He Asn Gly Met Leu Glu 
545 550 555 560 

Gly Tyr He Asn Asn Leu Phe Gly Thr He Glu Arg Leu Arg Glu Thr 
40 565 570 575 

Asn Ala Gly Leu Ala Thr Gin Leu Gin Ala Arg Asp Arg Glu Leu Arg 

580 585 590 

Arg Ala Gin Ala Gly Ala Leu Glu Arg Glu Gin Arg Ala Ala Asp Arg 
595 600 605

Ala Ala Gly Gly Gly Ala Gly Arg Pro Ala Glu Ala Asp Leu Leu Arg 

610 615 620 

Ala Asp Tyr Asp lie lie Asp Val Ser Lys Ser Met Asp Asp Asp Thr 
5 625 630 635 640 

Tyr Val Ala Asn Ser Phe Gin His Gin Tyr lie Pro Ala Tyr Gly Gin 

645 650 655 

Asp Leu Glu Arg Leu Ser Arg Leu Trp Glu His Glu Leu Val Arg Cys 
660 665 670 

10 Phe Lys He Leu Arg His Arg Asn Asn Gin Gly Gin Glu Thr Ser He 
675 680 685 

Ser Tyr Ser Ser Gly Ala He Ala Ser Phe Val Ala Pro Tyr Phe Glu 

690 695 700 

Tyr Val Leu Arg Ala Pro Arg Ala Gly Ala Leu He Thr Gly Ser Asp 
15 705 710 715 720 

Val He Leu Gly Glu Glu Glu Leu Trp Glu Ala Val Phe Lys Lys Thr 

725 730 735 

Arg Leu Gin Thr Tyr Leu Thr Asp Val Ala Ala Leu Phe Val Ala Asp 
740 745 750 

20 Val Gin His Ala Ala Leu Pro Arg Pro Pro Ser Pro Thr Pro Ala Asp 
755 760 765 

Phe Arg Ala Ser Asp Arg Gly Gly Ser Arg Ser Arg Thr Arg Thr Arg 

770 775 780 

Ser Arg Ser Pro Gly Arg Thr Pro Arg Gly Ala Pro Asp Gin Gly Trp 
25 785 790 795 800 

Gly Val Glu Arg Arg Asp Gly Arg Pro His Ala Arg Arg 
805 810 



(2) INFORMATION FOR SEQ ID NO: 299: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299: 

Met Ala Leu Gly Arg Val Gly Leu Ala Val Gly Leu Trp Gly Leu Leu 

1 5 10 15

Trp Val Gly Val Val Val Val Leu Ala Asn Asp Gly Arg Thr He Thr 
20 25 30

Val Gly Pro Arg Gly Asn Asn Ala Ala Pro Ser Asp Arg Asn Ala Ser 

35 40 45

Ala Pro Arg Thr Thr Pro Thr Pro Pro Gin Pro Arg Lys Ala Thr Lys 
5 50 55 60 

Ser Lys Ala Ser Thr Ala Lys Pro Ala Pro Pro Pro Lys Thr Gly Pro 
65 70 75 80 

Pro Lys Thr Ser Ser Glu Pro Val Arg Cys Asn Arg His Asp Pro Leu 
85 90 95 

10 Ala Arg Tyr Gly Ser Arg Val Gin lie Arg Cys Arg Phe Pro Asn Ser 
100 105 110 

Thr Arg Thr Glu Ser Arg Leu Gin lie Trp Arg Tyr Ala Thr Ala Thr 

115 120 125 

Asp Ala Glu He Gly Thr Ala Pro Ser Leu Glu Glu Val Met Val Asn 
15 130 135 140 

Val Ser Ala Pro Pro Gly Gly Gin Leu Val Tyr Asp Ser Ala Pro Asn 
145 150 155 160 

Arg Thr Asp Pro His Val He Trp Ala Glu Gly Ala Gly Pro Gly Asp 
165 170 175 

20 Arg Lys Val Val Gly Pro Leu Gly Arg Gin Arg Leu He He Glu Glu 
180 185 190 

Leu Thr Leu Glu Thr Gin Gly Met Tyr Tyr Trp Val Trp Gly Arg Thr 

195 200 205 

Asp Arg Pro Ser Ala Tyr Gly Thr Trp Val Arg Val Arg Val Phe Arg 
25 210 215 220 

Pro Pro Ser Leu Thr lie His Pro His Ala Val Leu Glu Gly Gin Pro 
225 230 235 240 

Phe Lys Ala Thr Cys Thr Ala Ala Thr Tyr Tyr Pro Gly Asn Arg Ala 
245 250 255 

30 Glu Phe Val Trp Phe Glu Asp Gly Arg Arg Val Phe Asp Pro Ala Gin 
260 265 270 

He His Thr Gin Thr Gin Glu Asn Pro Asp Gly Phe Ser Thr Val Ser 

275 280 285 

Thr Val Thr Ser Ala Ala Val Gly Gly Gin Gly Pro Pro Arg Thr Phe 
35 290 295 300 

Thr Cys Gin Leu Thr Trp His Arg Asp Ser Val Ser Phe Ser Arg Arg 
305 310 315 320 

Asn Ala Ser Gly Thr Ala Ser Val Leu Pro Arg Pro Thr He Thr Met 
325 330 335 

40 Glu Phe Thr Gly Asp His Ala Val Cys Thr Ala Gly Cys Val Pro Glu 
340 345 350 

Gly Val Thr Phe Ala Trp Phe Leu Gly Asp Asp Ser Ser Pro Ala Glu 
355 360 365 
Lys Val Ala Val Ala Ser Gln Thr Ser Cys Gly Arg Pro Gly Thr Ala

370 375 380 

Thr Ile Arg Ser Thr Leu Pro Val Ser Tyr Glu Gln Thr Glu Tyr Ile
385 390 395 400 

5 Cys Arg Leu Ala Gly Tyr Pro Asp Gly He Pro Val Leu Glu His His 

405 410 415 

Gly Ser His Gin Pro Pro Pro Arg Asp Pro Thr Glu Arg Gin Val lie 

420 425 430 

Arg Ala Val Glu Gly Ala Gly He Gly Val Ala Val Leu Val Ala Val 
10 435 440 445 

Val Leu Ala Gly Thr Ala Val Val Tyr Leu Thr His Ala Ser Ser Val 

450 455 460 

Arg Tyr Arg Arg Leu Arg 
465 470 


(2) INFORMATION FOR SEQ ID NO: 300:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 536 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO:300: 

Met Gly Ala Gly Val Pro Trp Thr Gly He Lys Arg Ala Gly Gly Pro 
1 5 10 15

30 He Thr Val Arg Val Leu Gly Trp Glu Val Ala Gin Lys Ala Thr His 
20 25 30 

Pro Cys Cys Ser Cys Pro Arg Glu Ala Val Val Ser Gly Asn Pro Pro 

35 40 45 

Arg Cys Ala Gly Arg Ala His Arg Ser Phe Ala Gly Ala Gly Ala Leu 
35 50 55 60 

Leu Val Met Ala Leu Gly Arg Val Gly Leu Ala Val Gly Leu Trp Gly
65 70 75 80 

Leu Leu Trp Val Gly Val Val Val Val Leu Ala Asn Asp Gly Arg Thr 
85 90 95 

40 He Thr Val Gly Pro Arg Gly Asn Asn Ala Ala Pro Ser Asp Arg Asn 
100 105 110

Ala Ser Ala Pro Arg Thr Thr Pro Thr Pro Pro Gin Pro Arg Lys Ala 
115 120 125 
Thr Lys Ser Lys Ala Ser Thr Ala Lys Pro Ala Pro Pro Pro Lys Thr

130 135 140 

Gly Pro Pro Lys Thr Ser Ser Glu Pro Val Arg Cys Asn Arg His Asp
145 150 155 160 

5 Pro Leu Ala Arg Tyr Gly Ser Arg Val Gin lie Arg Cys Arg Phe Pro 

165 170 175 

Asn Ser Thr Arg Thr Glu Ser Arg Leu Gin lie Trp Arg Tyr Ala Thr 

180 185 190 

Ala Thr Asp Ala Glu He Gly Thr Ala Pro Ser Leu Glu Glu Val Met 
10 195 200 205 

Val Asn Val Ser Ala Pro Pro Gly Gly Gin Leu Val Tyr Asp Ser Ala 

210 215 220 

Pro Asn Arg Thr Asp Pro His Val lie Trp Ala Glu Gly Ala Gly Pro 
225 230 235 240 

15 Gly Asp Arg Lys Val Val Gly Pro Leu Gly Arg Gin Arg Leu He He 

245 250 255 

Glu Glu Leu Thr Leu Glu Thr Gin Gly Met Tyr Tyr Trp Val Trp Gly 

260 265 270 

Arg Thr Asp Arg Pro Ser Ala Tyr Gly Thr Trp Val Arg Val Arg Val 
20 275 280 285 

Phe Arg Pro Pro Ser Leu Thr He His Pro His Ala Val Leu Glu Gly 

290 295 300 

Gin Pro Phe Lys Ala Thr Cys Thr Ala Ala Thr Tyr Tyr Pro Gly Asn 
305 310 315 320 

25 Arg Ala Glu Phe Val Trp Phe Glu Asp Gly Arg Arg Val Phe Asp Pro 

325 330 335 

Ala Gin He His Thr Gin Thr Gin Glu Asn Pro Asp Gly Phe Ser Thr 

340 345 350 

Val Ser Thr Val Thr Ser Ala Ala Val Gly Gly Gin Gly Pro Pro Arg 
30 355 360 365 

Thr Phe Thr Cys Gin Leu Thr Trp His Arg Asp Ser Val Ser Phe Ser 

370 375 380 

Arg Arg Asn Ala Ser Gly Thr Ala Ser Val Leu Pro Arg Pro Thr He 
385 390 395 400 

35 Thr Met Glu Phe Thr Gly Asp His Ala Val Cys Thr Ala Gly Cys Val 

405 410 415 

Pro Glu Gly Val Thr Phe Ala Trp Phe Leu Gly Asp Asp Ser Ser Pro 

420 425 430 

Ala Glu Lys Val Ala Val Ala Ser Gin Thr Ser Cys Gly Arg Pro Gly 
40 435 440 445 

Thr Ala Thr He Arg Ser Thr Leu Pro Val Ser Tyr Glu Gin Thr Glu 

450 455 460 

Tyr He Cys Arg Leu Ala Gly Tyr Pro Asp Gly He Pro Val Leu Glu 
465 470 475 480

His His Gly Ser His Gin Pro Pro Pro Arg Asp Pro Thr Glu Arg Gin 

485 490 495

Val lie Arg Ala Val Glu Gly Ala Gly He Gly Val Ala Val Leu Val 

500 505 510 

Ala Val Val Leu Ala Gly Thr Ala Val Val Tyr Leu Thr His Ala Ser 

515 520 525 

Ser Val Arg Tyr Arg Arg Leu Arg 
530 535 

(2) INFORMATION FOR SEQ ID NO: 301:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 545 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301: 



Met Ser Val Leu Gly Asp Ala Arg His Pro Arg Arg Phe Pro Ser Arg 
1 5 10 15

25 Gly Pro Arg Pro Phe Ser Val Ala Gly Pro Gly Ser Leu Pro Pro Ser 
20 25 30 

Pro Pro Pro Gly Ala Arg Ala Arg Leu He Arg Leu Ser Arg Ser Leu 

35 40 45 

Phe Pro Asp Pro Thr Ala Pro Met Asp Leu Leu Val Asp Asp Leu Phe 
30 50 55 60 

Ala Asp Ala Asp Gly Val Ser Pro Pro Pro Pro Arg Pro Ala Gly Gly 
65 70 75 80 

Pro Lys Asn Thr Pro Ala Ala Pro Pro Leu Tyr Ala Thr Gly Arg Leu 
85 90 95 

35 Ser Gin Ala Gin Leu Met Pro Ser Pro Pro Met Pro Val Pro Pro Ala 
100 105 110 

Ala Leu Phe Asn Arg Leu Leu Asp Asp Leu Gly Phe Ser Ala Gly Pro 

115 120 125 

Ala Leu Cys Thr Met Leu Asp Thr Trp Asn Glu Asp Leu Phe Ser Gly 
40 130 135 140 

Phe Pro Thr Asn Ala Asp Met Tyr Arg Glu Cys Lys Phe Leu Ser Thr 
145 150 155 160 

Leu Pro Ser Asp Val He Asp Trp Gly Asp Ala His Val Pro Glu Arg 




WO 98/20016 



PCT/US97/20016 



165 170 175 

Ser Pro He Asp He Arg Ala His Gly Asp Val Ala Phe Pro Thr Leu 

180 185 190

Pro Ala Thr Arg Asp Glu Leu Pro Ser Tyr Tyr Glu Ala Met Ala Gin 
195 200 205

Phe Phe Arg Gly Glu Leu Arg Ala Arg Glu Glu Ser Tyr Arg Thr Val 

210 215 220 

Leu Ala Asn Phe Cys Ser Ala Leu Tyr Arg Tyr Leu Arg Ala Ser Val 
225 230 235 240 

Arg Gln Leu His Arg Gln Ala His Met Arg Gly Arg Asn Arg Asp Leu

245 250 255 

Arg Glu Met Leu Arg Thr Thr He Ala Asp Arg Tyr Tyr Arg Glu Thr 

260 265 270 

Ala Arg Leu Ala Arg Val Leu Phe Leu His Leu Tyr Leu Phe Leu Ser 
15 275 280 285 

Arg Glu lie Leu Trp Ala Ala Tyr Ala Glu Gin Met Met Arg Pro Asp 

290 295 300 

Leu Phe Asp Gly Leu Cys Cys Asp Leu Glu Ser Trp Arg Gin Leu Ala 
305 310 315 320 

20 Cys Leu Phe Gin Pro Leu Met Phe He Asn Gly Ser Leu Thr Val Arg 

325 330 335 

Gly Val Pro Val Glu Ala Arg Arg Leu Arg Glu Leu Asn His He Arg 

340 345 350 

Glu His Leu Asn Leu Pro Leu Val Arg Ser Ala Ala Ala Glu Glu Pro 
25 355 360 365 

Gly Ala Pro Leu Thr Thr Pro Pro Val Leu Gin Gly Asn Gin Ala Arg 

370 375 380 

Ser Ser Gly Tyr Phe Met Leu Leu He Arg Ala Lys Leu Asp Ser Tyr 
385 390 395 400 

30 Ser Ser Val Ala Thr Ser Glu Gly Glu Ser Val Met Arg Glu His Ala 

405 410 415 

Tyr Ser Arg Gly Arg Thr Arg Asn Asn Tyr Gly Ser Thr He Glu Gly 

420 425 430 

Leu Leu Asp Leu Pro Asp Asp Asp Asp Ala Pro Ala Glu Ala Gly Leu 
35 435 440 445 

Val Ala Pro Arg Met Ser Phe Leu Ser Ala Gly Gin Arg Pro Arg Arg 

450 455 460 

Leu Ser Thr Thr Ala Pro He Thr Asp Val Ser Leu Gly Asp Glu Leu 
465 470 475 480 

40 Arg Leu Asp Gly Glu Glu Val Asp Met Thr Pro Ala Asp Ala Leu Asp 

485 490 495 

Asp Phe Asp Leu Glu Met Leu Gly Asp Val Glu Ser Pro Ser Pro Gly 
500 505 510 
Met Thr His Asp Pro Val Ser Tyr Gly Ala Leu Asp Val Asp Asp Phe

515 520 525 

Glu Phe Glu Gin Met Phe Thr Asp Ala Met Gly lie Asp Asp Phe Gly 
530 535 540 

Gly
545 

(2) INFORMATION FOR SEQ ID NO: 302:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 490 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302: 

Met Asp Leu Leu Val Asp Asp Leu Phe Ala Asp Ala Asp Gly Val Ser
1 5 10 15

Pro Pro Pro Pro Arg Pro Ala Gly Gly Pro Lys Asn Thr Pro Ala Ala 

20 25 30 

Pro Pro Leu Tyr Ala Thr Gly Arg Leu Ser Gin Ala Gin Leu Met Pro 
25 35 40 45 

Ser Pro Pro Met Pro Val Pro Pro Ala Ala Leu Phe Asn Arg Leu Leu 

50 55 60 

Asp Asp Leu Gly Phe Ser Ala Gly Pro Ala Leu Cys Thr Met Leu Asp 
65 70 75 80 

30 Thr Trp Asn Glu Asp Leu Phe Ser Gly Phe Pro Thr Asn Ala Asp Met 

85 90 95 

Tyr Arg Glu Cys Lys Phe Leu Ser Thr Leu Pro Ser Asp Val lie Asp 

100 105 110 

Trp Gly Asp Ala His Val Pro Glu Arg Ser Pro lie Asp lie Arg Ala 
35 115 120 125 

His Gly Asp Val Ala Phe Pro Thr Leu Pro Ala Thr Arg Asp Glu Leu 

130 135 140 

Pro Ser Tyr Tyr Glu Ala Met Ala Gin Phe Phe Arg Gly Glu Leu Arg 
145 150 155 160 

40 Ala Arg Glu Glu Ser Tyr Arg Thr Val Leu Ala Asn Phe Cys Ser Ala 

165 170 175 

Leu Tyr Arg Tyr Leu Arg Ala Ser Val Arg Gin Leu His Arg Gin Ala 
180 185 190 




His Met Arg Gly Arg Asn Arg Asp Leu Arg Glu Met Leu Arg Thr Thr 

195 200 205

Ile Ala Asp Arg Tyr Tyr Arg Glu Thr Ala Arg Leu Ala Arg Val Leu
210 215 220 

5 Phe Leu His Leu Tyr Leu Phe Leu Ser Arg Glu lie Leu Trp Ala Ala 
225 230 235 240 

Tyr Ala Glu Gin Met Met Arg Pro Asp Leu Phe Asp Gly Leu Cys Cys 

245 250 255 

Asp Leu Glu Ser Trp Arg Gin Leu Ala Cys Leu Phe Gin Pro Leu Met 
10 260 265 270 

Phe He Asn Gly Ser Leu Thr Val Arg Gly Val Pro Val Glu Ala Arg 

275 280 285 

Arg Leu Arg Glu Leu Asn His He Arg Glu His Leu Asn Leu Pro Leu 
290 295 300 

15 Val Arg Ser Ala Ala Ala Glu Glu Pro Gly Ala Pro Leu Thr Thr Pro 
305 310 315 320 

Pro Val Leu Gin Gly Asn Gin Ala Arg Ser Ser Gly Tyr Phe Met Leu 

325 330 335 

Leu He Arg Ala Lys Leu Asp Ser Tyr Ser Ser Val Ala Thr Ser Glu 
20 340 345 350 

Gly Glu Ser Val Met Arg Glu His Ala Tyr Ser Arg Gly Arg Thr Arg 

355 360 365 

Asn Asn Tyr Gly Ser Thr He Glu Gly Leu Leu Asp Leu Pro Asp Asp 
370 375 380 

25 Asp Asp Ala Pro Ala Glu Ala Gly Leu Val Ala Pro Arg Met Ser Phe 
385 390 395 400 

Leu Ser Ala Gly Gin Arg Pro Arg Arg Leu Ser Thr Thr Ala Pro He 

405 410 415 

Thr Asp Val Ser Leu Gly Asp Glu Leu Arg Leu Asp Gly Glu Glu Val 
30 420 425 430 

Asp Met Thr Pro Ala Asp Ala Leu Asp Asp Phe Asp Leu Glu Met Leu 

435 440 445

Gly Asp Val Glu Ser Pro Ser Pro Gly Met Thr His Asp Pro Val Ser 
450 455 460 

35 Tyr Gly Ala Leu Asp Val Asp Asp Phe Glu Phe Glu Gin Met Phe Thr 
465 470 475 480 

Asp Ala Met Gly He Asp Asp Phe Gly Gly 
485 490 

(2) INFORMATION FOR SEQ ID NO: 303:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 552 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303: 

Met Arg Gly Gly Gly Arg Glu Met Ser Val Leu Gly Asp Ala Arg His 
1 5 10 15

Pro Arg Arg Phe Pro Ser Arg Gly Pro Arg Pro Phe Ser Val Ala Gly 

20 25 30 

Pro Gly Ser Leu Pro Pro Ser Pro Pro Pro Gly Ala Arg Ala Arg Leu 
35 40 45 

15 lie Arg Leu Ser Arg Ser Leu Phe Pro Asp Pro Thr Ala Pro Met Asp 
50 55 60 

Leu Leu Val Asp Asp Leu Phe Ala Asp Ala Asp Gly Val Ser Pro Pro
65 70 75 80 

Pro Pro Arg Pro Ala Gly Gly Pro Lys Asn Thr Pro Ala Ala Pro Pro 
20 85 90 95 

Leu Tyr Ala Thr Gly Arg Leu Ser Gin Ala Gin Leu Met Pro Ser Pro 

100 105 110 

Pro Met Pro Val Pro Pro Ala Ala Leu Phe Asn Arg Leu Leu Asp Asp 
115 120 125 

25 Leu Gly Phe Ser Ala Gly Pro Ala Leu Cys Thr Met Leu Asp Thr Trp 
130 135 140 

Asn Glu Asp Leu Phe Ser Gly Phe Pro Thr Asn Ala Asp Met Tyr Arg 
145 150 155 160 

Glu Cys Lys Phe Leu Ser Thr Leu Pro Ser Asp Val He Asp Trp Gly 
30 165 170 175 

Asp Ala His Val Pro Glu Arg Ser Pro He Asp He Arg Ala His Gly 

180 185 190 

Asp Val Ala Phe Pro Thr Leu Pro Ala Thr Arg Asp Glu Leu Pro Ser 
195 200 205 

35 Tyr Tyr Glu Ala Met Ala Gin Phe Phe Arg Gly Glu Leu Arg Ala Arg 
210 215 220 

Glu Glu Ser Tyr Arg Thr Val Leu Ala Asn Phe Cys Ser Ala Leu Tyr 
225 230 235 240 

Arg Tyr Leu Arg Ala Ser Val Arg Gln Leu His Arg Gln Ala His Met
40 245 250 255 

Arg Gly Arg Asn Arg Asp Leu Arg Glu Met Leu Arg Thr Thr He Ala 

260 265 270 

Asp Arg Tyr Tyr Arg Glu Thr Ala Arg Leu Ala Arg Val Leu Phe Leu 
275 280 285

His Leu Tyr Leu Phe Leu Ser Arg Glu lie Leu Trp Ala Ala Tyr Ala 

290 295 300 

Glu Gin Met Met Arg Pro Asp Leu Phe Asp Gly Leu Cys Cys Asp Leu 
5 305 310 315 320 

Glu Ser Trp Arg Gin Leu Ala Cys Leu Phe Gin Pro Leu Met Phe lie 

325 330 335 

Asn Gly Ser Leu Thr Val Arg Gly Val Pro Val Glu Ala Arg Arg Leu 
340 345 350 

10 Arg Glu Leu Asn His lie Arg Glu His Leu Asn Leu Pro Leu Val Arg 
355 360 365 

Ser Ala Ala Ala Glu Glu Pro Gly Ala Pro Leu Thr Thr Pro Pro Val 

370 375 380 

Leu Gin Gly Asn Gin Ala Arg Ser Ser Gly Tyr Phe Met Leu Leu lie 
15 385 390 395 400 

Arg Ala Lys Leu Asp Ser Tyr Ser Ser Val Ala Thr Ser Glu Gly Glu 

405 410 415 

Ser Val Met Arg Glu His Ala Tyr Ser Arg Gly Arg Thr Arg Asn Asn 
420 425 430 

20 Tyr Gly Ser Thr lie Glu Gly Leu Leu Asp Leu Pro Asp Asp Asp Asp 
435 440 445 

Ala Pro Ala Glu Ala Gly Leu Val Ala Pro Arg Met Ser Phe Leu Ser 

450 455 460 

Ala Gly Gin Arg Pro Arg Arg Leu Ser Thr Thr Ala Pro lie Thr Asp 
25 465 470 475 480 

Val Ser Leu Gly Asp Glu Leu Arg Leu Asp Gly Glu Glu Val Asp Met 

485 490 495 

Thr Pro Ala Asp Ala Leu Asp Asp Phe Asp Leu Glu Met Leu Gly Asp
500 505 510 

30 Val Glu Ser Pro Ser Pro Gly Met Thr His Asp Pro Val Ser Tyr Gly 
515 520 525 

Ala Leu Asp Val Asp Asp Phe Glu Phe Glu Gin Met Phe Thr Asp Ala 

530 535 540 

Met Gly Ile Asp Asp Phe Gly Gly
545 550
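
The sequence listings above give each polypeptide as three-letter residue codes with position numbers beneath, together with a declared LENGTH. A reader who wishes to check a transcribed listing against its declared length could use something like the following minimal Python sketch. It is illustrative only and is not part of the disclosure: the function names (`parse_residues`, `to_one_letter`, `matches_declared_length`) are hypothetical, and the `OCR_FIXES` table encodes an assumption about common scanner misreadings ("Gin" for Gln, "lie" or "He" for Ile), not any standard.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Assumed scanner misreadings (hypothetical table, adjust as needed).
OCR_FIXES = {"Gin": "Gln", "lie": "Ile", "He": "Ile"}


def parse_residues(lines):
    """Return (residues, unrecognized) from listing lines.

    Position-number lines contain no letters and contribute nothing.
    Alphabetic tokens that are not residue codes are collected for review.
    """
    residues, unrecognized = [], []
    for line in lines:
        for token in re.findall(r"[A-Za-z]+", line):
            token = OCR_FIXES.get(token, token)
            if token in THREE_TO_ONE:
                residues.append(token)
            else:
                unrecognized.append(token)
    return residues, unrecognized


def to_one_letter(residues):
    """Convert a list of three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


def matches_declared_length(lines, declared_length):
    """True if every token was recognized and the count equals LENGTH."""
    residues, unrecognized = parse_residues(lines)
    return not unrecognized and len(residues) == declared_length
```

For example, passing the last two residue lines of SEQ ID NO: 303 with their position lines to `parse_residues` yields 24 residues, which `to_one_letter` renders as `ALDVDDFEFEQMFTDAMGIDDFGG`. Unrecognized tokens are returned rather than dropped silently, so a damaged line is flagged for manual review instead of shortening the sequence unnoticed.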
What is claimed is:

1. An isolated polynucleotide comprising a polynucleotide sequence selected from
the group consisting of:

(a) a polynucleotide having at least a 70% identity to a polynucleotide encoding a
polypeptide comprising an amino acid sequence of Table 1, 2, 3 or 4;

(b) a polynucleotide having at least a 70% identity to a polynucleotide encoding a
mature polypeptide expressed by the gene contained in the HSV-2 of deposited strain VR-2546
that was sequenced to obtain a polynucleotide sequence of Table 1, 2 or 3;

(c) a polynucleotide encoding a polypeptide comprising an amino acid sequence
which is at least 70% identical to an amino acid sequence of Table 1, 2, 3 or 4;

(d) a polynucleotide which is complementary to the polynucleotide of (a), (b) or
(c); and

(e) a polynucleotide comprising at least 15 sequential bases of the polynucleotide
of (a), (b), (c) or (d).

2. The polynucleotide of Claim 1 wherein the polynucleotide is DNA.

3. The polynucleotide of Claim 1 wherein the polynucleotide is RNA. 

4. The polynucleotide of Claim 2 comprising the nucleic acid sequence selected
from the group consisting of the nucleic acid sequences set forth in Table 1, 2 and 3.

5. The polynucleotide of Claim 2 which encodes a polypeptide comprising an
amino acid sequence selected from the group consisting of the amino acid sequences
set forth in Table 1, 2, 3 and 4.

6. A vector comprising the polynucleotide of Claim 1.

7. A host cell comprising the vector of Claim 6. 

8. A process for producing a polypeptide comprising expressing in the host cell of
Claim 7 a polypeptide encoded by said polynucleotide.

9. A process for producing a polypeptide or fragment thereof comprising 
culturing a host cell of Claim 7 under conditions sufficient for the production of said 
polypeptide or fragment. 

10. A polypeptide comprising an amino acid sequence which is at least 70%
identical to an amino acid sequence selected from the group consisting of the amino acid
sequences or fragments thereof set forth in Table 1, 2, 3 and 4.

11. A polypeptide comprising an amino acid sequence selected from the group
consisting of the amino acid sequences or fragments thereof set forth in Table 1, 2, 3, and 4.

12. An antibody generated against the polypeptide of claim 10.
13. An antagonist or agonist of the activity or expression of the polypeptide of
claim 10. 

14. A method for the treatment or prevention of disease of an individual comprising 
administering to the individual a therapeutically effective amount of the polypeptide of claim 10. 

15. A method for the treatment of an individual having medical need to inhibit a
viral polypeptide comprising administering to the individual a therapeutically effective amount
of the antagonist of Claim 13.

16. A process for diagnosing a disease related to expression or activity of the 
polypeptide of claim 10 in an individual comprising 

(a) determining a nucleic acid sequence encoding said polypeptide, and/or

(b) analyzing for the presence or amount of said polypeptide in a sample derived from 
the individual. 

17. A method for identifying compounds which inhibit or activate the polypeptide 
of claim 10 comprising 

(a) contacting a composition comprising the polypeptide with the compound to be
screened under conditions to permit interaction between the compound and the polypeptide to
assess the interaction of a compound, such interaction being associated with a second component 
capable of providing a detectable signal in response to the interaction of the polypeptide with the 
compound; and 

(b) determining whether the compound activates or inhibits polypeptide by detecting
the presence or absence of the signal generated from the interaction of the compound with the
polypeptide. 

18. A method for inducing an immunological response in a mammal which 
comprises inoculating the mammal with the polypeptide of Claim 10, or a variant thereof,
adequate to produce antibody and/or T cell immune response to protect said animal from
disease. 

19. A method of inducing immunological response in a mammal which comprises 
delivering a nucleic acid vector to direct expression of a polypeptide of Claim 10, or a variant 
thereof, for expressing said polypeptide in vivo in order to induce an immunological
response to produce antibody and/or T cell immune response to protect said animal from
disease. 

20. The isolated polynucleotide of claim 1 wherein said nucleotide is selected from 
the group consisting of: 

(a) a polynucleotide having at least a 90% identity to a polynucleotide encoding a
polypeptide comprising the amino acid sequence of Table 1, 2, 3 or 4;

(b) a polynucleotide having at least a 90% identity to a polynucleotide encoding the
same mature polypeptide expressed by the gene contained in the HSV-2 of the deposited strain
VR-2546 that was sequenced to obtain a polynucleotide sequence of Table 1, 2 or 3;

(c) a polynucleotide encoding a polypeptide comprising an amino acid sequence
which is at least 90% identical to the amino acid sequence of Table 1, 2, 3 or 4;

(d) a polynucleotide which is complementary to the polynucleotide of (a), (b) or
(c); and

(e) a polynucleotide comprising at least 15 sequential bases of the polynucleotide
of (a), (b), (c) or (d).

21. The isolated polynucleotide of Claim 1 selected from the group consisting of:

(a) a polynucleotide having at least a 95% identity to a polynucleotide encoding a
polypeptide comprising the amino acid sequence of Table 1, 2, 3 or 4;

(b) a polynucleotide having at least a 95% identity to a polynucleotide encoding the
same mature polypeptide expressed by the gene contained in the HSV-2 of the deposited strain
VR-2546 that was sequenced to obtain a polynucleotide sequence of Table 1, 2 or 3;

(c) a polynucleotide encoding a polypeptide comprising an amino acid sequence
which is at least 95% identical to the amino acid sequence of Table 1, 2, 3 or 4;

(d) a polynucleotide which is complementary to the polynucleotide of (a), (b) or
(c); and

(e) a polynucleotide comprising at least 15 sequential bases of the polynucleotide
of (a), (b), (c) or (d).

22. An isolated polynucleotide comprising a polynucleotide sequence selected from 
the group consisting of: 

(a) a polynucleotide having at least a 50% identity to a polynucleotide encoding a
polypeptide comprising the amino acid sequence of Table 1, 2, 3 or 4 and obtained from a
prokaryotic species other than HSV-2;

(b) a polynucleotide encoding a polypeptide comprising an amino acid sequence
which is at least 50% identical to the amino acid sequence of Table 1, 2, 3 or 4 and obtained
from a prokaryotic species other than HSV-2; and

(c) a polynucleotide which is complementary to the polynucleotide of (a) or (b).

23. An isolated polypeptide having one of the amino acid sequences given in 
Table 1, 2, 3 or 4.

24. An isolated nucleic acid encoding one of the amino acid sequences of
Claim 1 and nucleic acid sequences capable of hybridizing therewith under stringent
conditions.

25. A recombinant vector comprising the nucleic acid sequences of
Claim 24 and host cells transformed or transfected therewith.

26. A method of identifying an antiviral compound comprising contacting 
candidate compounds with a polypeptide of Claim 10 and selecting those compounds
capable of inhibiting the bioactivity of said polypeptide.

27. Antiviral compounds identified by the method of Claim 26. 

28. An isolated polypeptide having an amino acid sequence or fragment thereof 
given in Table 1, 2, 3 or 4.

29. An isolated nucleic acid encoding one of the amino acid sequences of
Claim 28 and nucleic acid sequences capable of hybridizing therewith under stringent
conditions.

30. A method of identifying an antiviral compound comprising contacting 
candidate compounds with a polypeptide of Claim 28 and selecting those compounds 
capable of inhibiting the bioactivity of said polypeptide. 

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:
Please See Extra Sheet.

1. [ ] As all required additional search fees were timely paid by the applicant, this international search report covers all searchable
claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [X] As only some of the required additional search fees were timely paid by the applicant, this international search report covers
only those claims for which fees were paid, specifically claims Nos.:
1-11, 20-25, 28, and 29

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this international search report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

Remark on Protest

The additional search fees were accompanied by the applicant's protest.
No protest accompanied the payment of additional search fees.

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING
This ISA found multiple inventions as follows:

This application contains the following inventions or groups of inventions which are not so linked as to form a single
inventive concept under PCT Rule 13.1. In order for all inventions to be searched, the appropriate additional search
fees must be paid.

Group I, claim(s) 1-9, 20-22, 24-25, and 29, drawn to Polynucleotide encoding HSV-2 and its first use (making
protein).

Group II, claim(s) 10, 11, 23, and 28, drawn to polypeptide(s) and their second use (diagnosis, second product).

Group III, claim(s) 12, and 14, drawn to antibodies (third product).

Group IV, claim(s) 13, and 15, drawn to antagonist or agonist (fourth product).

Group V, claim(s) 16, 17, 26, 27, and 30, drawn to method of diagnosing and identifying polypeptide (fifth product).

Group VI, claim(s) 18, and 19, drawn to vaccine (sixth product).

and it considers that the International Application does not comply with the requirements of unity of invention (Rules 
13.1, 13.2 and 13.3) for the reasons indicated below: 

The inventions listed as Groups I-VI do not relate to a single inventive concept under PCT Rule 13.1 because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: Groups I-VI
are directed to isolation and use of nucleic acid of herpes simplex virus type 2 (HSV-2). They are further directed to
various percent identity of the said virus nucleotide. In addition they are directed to expression and translation of the
isolated nucleic acids of claim 1 in various diagnostic and pharmaceutical compositions. The claims are linked by the
disclosed nucleic acid sequences of HSV-2 of claim 1. However, such does not constitute a special technical feature
because an isolated nucleotide sequence of herpes virus encoding capsid and protease has been reported previously by
Steffy et al. (Journal of General Virology, Vol. 76, 1995) and also Smithkline Beecham Corporation (WO 95/06055).
The cited evidence shows that the technical feature of Group I, the disclosed nucleic acid sequences of HSV-2, does not
make a contribution over the prior art. The claims are not so linked by a special technical feature within the meaning
of PCT Rule 13.2 so as to form a single inventive concept; accordingly, unity of invention is lacking among all
groups.